Abstract. We study the problem of determining strongly connected components
(Sccs) of directed hypergraphs. The main contribution is an algorithm computing
the terminal strongly connected components (i.e. Sccs which do not reach any
component other than themselves). The time complexity of the algorithm is
almost linear, which is a significant improvement over the known methods,
which run in quadratic time. This also proves that the problems of (i) testing
strong connectivity and (ii) determining the existence of a sink can both be
solved in almost linear time in directed hypergraphs. We also highlight an
important discrepancy between the reachability relations in directed
hypergraphs and graphs. We establish a superlinear lower bound on the size of
the transitive reduction of the reachability relation in directed hypergraphs,
showing that it is combinatorially more complex than in directed graphs. We
also prove linear time reductions from combinatorial problems on the subset
partial order, in particular from the well-studied problem of finding all
minimal sets among a given family, to the problem of computing the Sccs in
directed hypergraphs.



1. Introduction 

Directed hypergraphs are a generalization of directed graphs, in which the
tail and the head of the arcs are sets of vertices. Directed hypergraphs have a
very large number of applications, since hyperarcs naturally provide a
representation of implication dependencies. Among others, they are used to
solve several problems related to satisfiability in propositional logic, in
particular relative to Horn formulas, see for instance [AI91, AFFG97, GP95,
GGPR98, Pre03]. They also appear in problems relative to network routing
[Pre00], functional dependencies in database theory [ADS83], model checking
[LS98], chemical reaction networks [Özt08], transportation networks [NP89,
NPG98], and more recently, algorithmics of convex polyhedra in tropical
algebra [AGG10, All09a].

Many algorithmic aspects of directed hypergraphs have been studied, in
particular optimization related ones, such as determining shortest paths
[NP89, NPA06], maximum flows, minimum cardinality cuts, or minimum weighted
hyperpaths (we refer to the surveys of Ausiello et al. [AFF01] and of Gallo et
al. [GLPN93] for a comprehensive list of contributions). Naturally, some
problems raised by the reachability relation in directed hypergraphs have also
been studied. For instance, determining the set of the vertices reachable from
a given vertex is known to be solvable in linear time in the size of the
directed hypergraph (see for instance [GLPN93]).¹

¹ In the sequel, the underlying model of computation is the Random Access Machine.

In directed graphs, many other problems are known to be efficiently solvable,
e.g. in linear time, such as testing acyclicity or strong connectivity,
computing the strongly connected components, determining a topological sorting
among them, etc. Surprisingly, the analogues of these elementary problems in
directed hypergraphs have not received any particular attention (as far as we
know). Unfortunately, none of the directed graph algorithms can be
straightforwardly extended to directed hypergraphs. The main reason is that
the reachability relation of hypergraphs does not have the same structure: for
instance, establishing that a given vertex $u$ reaches another vertex $v$
generally involves vertices which do not reach $v$.
Contributions. In this paper, we tackle some reachability problems relative to di- 
rected hypergraphs and their strongly connected components. 

Section 3 presents an almost linear time algorithm which determines the
terminal strongly connected components of a hypergraph (a component is said to
be terminal when it reaches no component other than itself). As discussed
below, this improves on the existing quadratic approaches. This also shows
that the following properties can be decided in almost linear time: (i) is a
directed hypergraph strongly connected? (ii) does the hypergraph admit a sink
(i.e. a vertex reachable from all vertices)? The algorithm proceeds by
iterating two steps. The first one consists in finding some (terminal) Sccs of
an underlying directed graph. In the second step, each of these components is
collapsed to a single vertex, which makes new arcs appear in the digraph
underlying the hypergraph. The two steps are carefully combined to gain
efficiency. Moreover, an elaborate instrumentation is set up to determine the
newly arising arcs without sacrificing the time complexity. A complete example
of an execution trace of the algorithm is provided in Appendix A.

Unfortunately, this algorithm cannot be extended to determine all strongly
connected components with the same complexity. In fact, the contributions
presented in Section 4 strongly suggest that the problem of computing the
entire set of Sccs is harder in directed hypergraphs than in directed graphs.
In particular, we prove a lower bound result which shows that the size of the
transitive reduction of the reachability relation may be superlinear in the
size of the directed hypergraph (whereas it is linearly upper bounded in the
setting of directed graphs). We deduce a linear time reduction from the
minimal set problem to the problem of computing the strongly connected
components. Given a family $\mathcal{F}$ of sets over a certain domain, the
minimal set problem consists in determining all the sets of $\mathcal{F}$
which are minimal for inclusion. While it has received much attention, the
best known algorithms only achieve subquadratic time.

Related Work. Reachability in directed hypergraphs has been defined in
different ways in the literature, depending on the context and the
applications. The reachability relation which is discussed here is basically
the same as in [ANI90, AI91, AFF01], but is referred to as B-reachability in
[GLPN93, GP95]. It precisely captures the logical implication dependencies in
Horn propositional logic, and also the functional dependencies in the context
of relational databases. Some variants of this reachability relation have been
introduced, e.g. in which any hyperpath has to be provided with a linear order
over the alternating sequence of its vertices and hyperarcs [TT09]. These
variants are beyond the scope of the paper.

As mentioned above, determining the set of the reachable vertices from a given
vertex has been thoroughly studied. Gallo et al. provide in [GLPN93] a linear
time algorithm. In a series of works [ANI90, AI91, AFFG97], Ausiello et al.
also introduce online algorithms maintaining the set of reachable vertices, or
hyperpaths between vertices, under hyperarc insertions/deletions.

To our knowledge, other reachability problems, such as topological sorting,
determining strongly connected components or terminal ones, have not been
specifically studied so far. Naturally, they can be solved in polynomial time
by using the algorithms previously mentioned. For instance, given a directed
hypergraph $\mathcal{H}$ with $n$ vertices, the whole graph of the
reachability relation can be determined in $O(n \cdot \mathrm{size}(\mathcal{H}))$
by $n$ calls to the algorithm of [GLPN93]. Computing the (terminal) strongly
connected components of this graph precisely yields the (terminal) components
of $\mathcal{H}$. However, this approach is obviously not optimal: for
instance, when $\mathcal{H}$ coincides with a directed graph, we know that the
problem can be simply solved in linear time.

Computing the transitive closure and reduction of a directed hypergraph has
also been studied by Ausiello et al. in [ADS86]. In their work, reachability
relations between sets of vertices are also taken into account, in contrast
with our present contribution in which we restrict to reachability relations
between vertices. The notion of transitive reduction in [ADS86] is also
different from the one discussed here (Section 4.1). More precisely, the
transitive reduction of [ADS86] rather corresponds to minimal hypergraphs
having the same transitive closure (several minimality properties are studied,
including minimal size, minimal number of hyperarcs, etc). In contrast, we
discuss here the transitive reduction of the reachability relation (as a
binary relation over vertices) and not of the hypergraph itself.

2. Preliminary definitions and notations 

A directed hypergraph is a pair $(V, A)$, where $V$ is a set of vertices, and
$A$ a set of hyperarcs. A hyperarc $a$ is itself a pair $(T, H)$, where $T$
and $H$ are both subsets of $V$. They respectively represent the tail and the
head of $a$, and are also denoted by $T(a)$ and $H(a)$. Note that throughout
this paper, the term hypergraph(s) will always refer to directed
hypergraph(s).

The size of a directed hypergraph $\mathcal{H} = (V, A)$ is defined as
$\mathrm{size}(\mathcal{H}) = |V| + \sum_{(T,H) \in A} (|T| + |H|)$.

Given a directed hypergraph $\mathcal{H} = (V, A)$, and $u, v \in V$, then $v$
is said to be reachable from $u$ in $\mathcal{H}$, which will be denoted by
$u \rightsquigarrow_{\mathcal{H}} v$, if $u = v$, or there exists a hyperarc
$a = (T, H)$ such that $v \in H$ and all the elements of $T$ are reachable
from $u$. This also leads to a notion of hyperpaths: a hyperpath from $u$ to
$v$ in $\mathcal{H}$ is a sequence of $p$ hyperarcs $a_1, \dots, a_p \in A$
satisfying $T(a_i) \subseteq \bigcup_{j=0}^{i-1} H(a_j)$ for all
$i = 1, \dots, p+1$, with the conventions $H(a_0) = \{u\}$ and
$T(a_{p+1}) = \{v\}$. The hyperpath is said to be minimal if none of its
subsequences is a hyperpath from $u$ to $v$.
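The recursive definition of reachability can be evaluated by forward chaining. The following Python sketch is only an illustration under stated assumptions: the function name `reachable_from`, the representation of hyperarcs as `(tail, head)` pairs, and the requirement that tails be nonempty are choices made here, and the sample hyperarcs follow Example 1 as read from the text. Each hyperarc keeps a counter of tail vertices not yet reached and fires when the counter drops to zero, which is in the spirit of the linear time procedures cited in the introduction rather than a transcription of any of them.

```python
from collections import deque


def reachable_from(source, hyperarcs):
    """Return the set of vertices reachable from `source`.

    `hyperarcs` is a list of (tail, head) pairs of vertex collections;
    tails are assumed to be nonempty.
    """
    tails = [set(tail) for tail, _ in hyperarcs]
    # missing[i]: number of tail vertices of hyperarc i not yet reached.
    missing = [len(tail) for tail in tails]
    # by_tail[v]: indices of the hyperarcs whose tail contains v.
    by_tail = {}
    for i, tail in enumerate(tails):
        for v in tail:
            by_tail.setdefault(v, []).append(i)

    reached = {source}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for i in by_tail.get(v, []):
            missing[i] -= 1
            if missing[i] == 0:  # whole tail reached: the head becomes reachable
                for w in hyperarcs[i][1]:
                    if w not in reached:
                        reached.add(w)
                        queue.append(w)
    return reached


# Hyperarcs a1..a5 of Example 1 (as read from the text).
EXAMPLE_ARCS = [
    ({"u"}, {"v"}),
    ({"v"}, {"w"}),
    ({"w"}, {"u"}),
    ({"v", "w"}, {"x", "y"}),
    ({"w", "y"}, {"t"}),
]
print(sorted(reachable_from("u", EXAMPLE_ARCS)))  # ['t', 'u', 'v', 'w', 'x', 'y']
print(sorted(reachable_from("y", EXAMPLE_ARCS)))  # ['y']
```

Each vertex is dequeued at most once and each (vertex, hyperarc) incidence is processed once, so the work is proportional to the size of the hypergraph.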

The strongly connected components (Sccs for short) of a directed hypergraph
$\mathcal{H}$ are the equivalence classes of the relation
$\equiv_{\mathcal{H}}$ defined by $u \equiv_{\mathcal{H}} v$ if
$u \rightsquigarrow_{\mathcal{H}} v$ and $v \rightsquigarrow_{\mathcal{H}} u$.

If $f$ is a function from $V$ to an arbitrary set, the image of the directed
hypergraph $\mathcal{H}$ by $f$ is the hypergraph, denoted $f(\mathcal{H})$,
of vertices $f(V)$ and of hyperarcs $\{(f(T(a)), f(H(a))) \mid a \in A\}$.
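As a small illustration of the image of a hypergraph by a function, the Python sketch below applies a map to every tail and head. The names `image` and `merge_uvw`, the target vertex `"z"`, and the sample hyperarcs (Example 1, as read from the text) are assumptions made for this illustration only.

```python
def image(vertices, hyperarcs, f):
    """Image of the hypergraph (vertices, hyperarcs) by the function f."""
    new_vertices = {f(v) for v in vertices}
    # Hyperarcs with the same image collapse, since the result is a set.
    new_hyperarcs = {
        (frozenset(f(v) for v in tail), frozenset(f(v) for v in head))
        for tail, head in hyperarcs
    }
    return new_vertices, new_hyperarcs


def merge_uvw(v):
    """Hypothetical map sending u, v and w to a single vertex z."""
    return "z" if v in {"u", "v", "w"} else v


ARCS = [
    ({"u"}, {"v"}),
    ({"v"}, {"w"}),
    ({"w"}, {"u"}),
    ({"v", "w"}, {"x", "y"}),
    ({"w", "y"}, {"t"}),
]
new_vertices, new_hyperarcs = image("uvwxyt", ARCS, merge_uvw)
print(sorted(new_vertices))   # ['t', 'x', 'y', 'z']
print(len(new_hyperarcs))     # 3
```

In this run the three simple hyperarcs collapse into a single loop on `z`, and the hyperarc with tail `{v, w}` becomes a simple hyperarc leaving `z`.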

Example 1. Consider the directed hypergraph depicted in Figure 1. Its vertices
are $u, v, w, x, y, t$, and its hyperarcs are $a_1 = (\{u\}, \{v\})$,
$a_2 = (\{v\}, \{w\})$, $a_3 = (\{w\}, \{u\})$, $a_4 = (\{v, w\}, \{x, y\})$,
and $a_5 = (\{w, y\}, \{t\})$. A hyperarc is represented as a bundle of arcs.
It is decorated with a solid disk portion when its tail contains several
vertices.

Figure 1. A directed hypergraph

Applying the recursive definition of reachability from $u$ discovers the
vertex $v$, then $w$, which leads to the two vertices $x$ and $y$ through the
hyperarc $a_4$, and finally $t$ through $a_5$. It can be checked that $t$ is
reachable from $u$ through the hyperpath $a_1, a_2, a_4, a_5$ (which is
minimal). As mentioned in Section 1, some vertices play the role of
"auxiliary" vertices when determining reachability. In our example,
establishing that $t$ is reachable from $u$ first requires establishing that
$y$ is reachable from $u$, while $y$ does not reach $t$. This is an important
difference with directed graphs, in which proving that $t$ is reachable from
$u$ would only involve vertices both reachable from $u$ and reaching $t$.

Observe that all the notions introduced in this section are generalizations of
their analogues on directed graphs. Indeed, any digraph $G = (V, A)$ can be
equivalently seen as a directed hypergraph
$\mathcal{H} = (V, \{(\{u\}, \{v\}) \mid (u, v) \in A\})$. Then the
reachability relations on $G$ and $\mathcal{H}$ coincide, and $G$ and
$\mathcal{H}$ both have the same size.

3. Computing terminal strongly connected components 

In this section, we describe an algorithm which determines all terminal Sccs
of a directed hypergraph. Given a hypergraph $\mathcal{H}$ of vertices $V$, a
component $C$ is said to be terminal if for any $u \in C$ and $v \in V$,
$u \rightsquigarrow_{\mathcal{H}} v$ implies $v \in C$. In other words, a Scc
is terminal when it does not reach any component except itself.
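For reference, the definition of terminal Sccs can be checked directly from the full reachability relation, which corresponds to the quadratic baseline mentioned in the related work. The sketch below is illustrative only: the names `reach` and `terminal_sccs_naive` are hypothetical, the reachability routine is a plain fixpoint iteration (simpler but slower than a linear time procedure), and the sample hyperarcs follow Example 1 as read from the text.

```python
def reach(source, hyperarcs):
    """Vertices reachable from `source`, by naive fixpoint iteration."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


def terminal_sccs_naive(vertices, hyperarcs):
    """Terminal Sccs, computed from one reachability set per vertex."""
    reach_sets = {u: reach(u, hyperarcs) for u in vertices}
    terminal = set()
    for u in vertices:
        # Scc of u: the vertices mutually reachable with u.
        component = frozenset(v for v in reach_sets[u] if u in reach_sets[v])
        # Terminal: nothing outside the component is reachable from u.
        if reach_sets[u] <= component:
            terminal.add(component)
    return terminal


ARCS = [
    ({"u"}, {"v"}),
    ({"v"}, {"w"}),
    ({"w"}, {"u"}),
    ({"v", "w"}, {"x", "y"}),
    ({"w", "y"}, {"t"}),
]
print(sorted(sorted(c) for c in terminal_sccs_naive("uvwxyt", ARCS)))
# [['t'], ['x'], ['y']]
```

On this input the component {u, v, w} is not terminal since it reaches x, whereas y alone cannot fire the hyperarc with tail {w, y} and therefore forms a terminal Scc.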

3.1. Principle of the algorithm. First observe that a directed graph
$\mathrm{graph}(\mathcal{H}) = (V, A')$ can be associated to any directed
hypergraph $\mathcal{H} = (V, A)$, by defining
$A' = \{(t, h) \mid (\{t\}, H) \in A \text{ and } h \in H\}$. The directed
graph $\mathrm{graph}(\mathcal{H})$ is generated by the simple hyperarcs of
$\mathcal{H}$, i.e. the elements $a \in A$ such that $|T(a)| = 1$. We first
point out a remarkable special case in which the terminal Sccs of
$\mathcal{H}$ and $\mathrm{graph}(\mathcal{H})$ are equal:

Proposition 1. Let $\mathcal{H}$ be a directed hypergraph such that each
terminal Scc of $\mathrm{graph}(\mathcal{H})$ is reduced to a singleton. Then
$\mathcal{H}$ and $\mathrm{graph}(\mathcal{H})$ have the same terminal Sccs.
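The construction of the underlying digraph from the simple hyperarcs can be written in a few lines. The Python sketch below is an illustration; the function name `underlying_graph` and the `(tail, head)` representation are assumptions of this example, and the sample hyperarcs follow Example 1 as read from the text.

```python
def underlying_graph(hyperarcs):
    """Arcs (t, h) generated by the hyperarcs whose tail is a singleton."""
    arcs = set()
    for tail, head in hyperarcs:
        tail = set(tail)
        if len(tail) == 1:  # simple hyperarc
            (t,) = tail
            arcs.update((t, h) for h in head)
    return arcs


ARCS = [
    ({"u"}, {"v"}),
    ({"v"}, {"w"}),
    ({"w"}, {"u"}),
    ({"v", "w"}, {"x", "y"}),
    ({"w", "y"}, {"t"}),
]
print(sorted(underlying_graph(ARCS)))  # [('u', 'v'), ('v', 'w'), ('w', 'u')]
```

The two hyperarcs with tails of size two contribute no arc, which is why the cycle on u, v, w is a terminal Scc of the digraph although it is not a terminal Scc of the hypergraph.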

This statement is a consequence of the fact that any Scc of $\mathcal{H}$ is
precisely the union of some Sccs of $\mathrm{graph}(\mathcal{H})$:

Lemma 2. Let $\mathcal{H}$ be a directed hypergraph. Each strongly connected
component $C$ of $\mathcal{H}$ is of the form $\bigcup_i C'_i$ where the
$C'_i$ are the Sccs of $\mathrm{graph}(\mathcal{H})$ such that
$C \cap C'_i \neq \emptyset$.

Proof. Consider $u \in C$. Then there exists a component $C'$ of
$\mathrm{graph}(\mathcal{H})$ such that $u \in C'$ (since the Sccs of
$\mathrm{graph}(\mathcal{H})$ form a partition of the set of the vertices),
and obviously $C \cap C' \neq \emptyset$.

Conversely, suppose that $C'$ is a Scc of $\mathrm{graph}(\mathcal{H})$ such
that $C \cap C' \neq \emptyset$. Let $u \in C \cap C'$. Then for any
$v \in C'$, we have
$u \rightsquigarrow_{\mathrm{graph}(\mathcal{H})} v \rightsquigarrow_{\mathrm{graph}(\mathcal{H})} u$,
so that $u \rightsquigarrow_{\mathcal{H}} v \rightsquigarrow_{\mathcal{H}} u$,
hence $v \in C$. □

Proof (Proposition 1). First suppose that $\{u\}$ is a terminal Scc of
$\mathrm{graph}(\mathcal{H})$. Suppose that there exists $v \neq u$ such that
$u \rightsquigarrow_{\mathcal{H}} v$. Consider a hyperpath $a_1, \dots, a_p$
from $u$ to $v$ in $\mathcal{H}$. Then there must be a hyperarc $a_i$ such
that $T(a_i) = \{u\}$ and $H(a_i) \neq \{u\}$ (otherwise, the hyperpath is a
cycle and $v = u$). Let $w \in H(a_i) \setminus \{u\}$. Then $(u, w)$ is an
arc of $\mathrm{graph}(\mathcal{H})$. Since $\{u\}$ is a terminal Scc of
$\mathrm{graph}(\mathcal{H})$, this enforces $w = u$, which is a
contradiction. Hence $\{u\}$ is a terminal Scc of $\mathcal{H}$.

Conversely, consider a terminal Scc $C$ of $\mathcal{H}$. Let $u \in C$, and
let $D$ be the Scc of $\mathrm{graph}(\mathcal{H})$ containing $u$. Consider
$D'$ a terminal Scc of $\mathrm{graph}(\mathcal{H})$ such that
$D \rightsquigarrow_{\mathrm{graph}(\mathcal{H})} D'$, and let $C'$ be a Scc
of $\mathcal{H}$ such that $D' \cap C' \neq \emptyset$. By Lemma 2, we have
$D' \subseteq C'$. It follows that $C \rightsquigarrow_{\mathcal{H}} C'$,
hence $C = C'$ as $C$ is terminal. Thus, $D' \subseteq C$, and since $D'$ is a
singleton, it also forms a Scc of $\mathcal{H}$ using the first part of the
proof. This shows $D' = C$ (since the Sccs of $\mathcal{H}$ form a partition
of the set of vertices), so that $C$ is a terminal Scc of
$\mathrm{graph}(\mathcal{H})$. □

The following proposition ensures that, in a directed hypergraph, merging two
vertices of the same Scc does not alter the reachability relation:

Proposition 3. Let $\mathcal{H} = (V, A)$ be a directed hypergraph, and let
$x, y \in V$ such that $x \equiv_{\mathcal{H}} y$. Consider the function $f$
mapping any vertex distinct from $x$ and $y$ to itself and both $x$ and $y$ to
a same vertex $z$ (with $z \notin V \setminus \{x, y\}$). Then
$u \rightsquigarrow_{\mathcal{H}} v$ if, and only if,
$f(u) \rightsquigarrow_{f(\mathcal{H})} f(v)$.

Proof. Let $\mathcal{H}' = f(\mathcal{H})$. Suppose that
$s \rightsquigarrow_{\mathcal{H}} t$. Observe that if $X, Y$ are subsets of
$V$, $f(X) \subseteq f(Y)$ as soon as $X \subseteq Y$, and
$f(X \cup Y) \subseteq f(X) \cup f(Y)$. Therefore, if $a_1, \dots, a_p$ is a
hyperpath from $s$ to $t$, then:

$T(a_i) \subseteq \{s\} \cup H(a_1) \cup \dots \cup H(a_{i-1})$ for all $1 \le i \le p$
$t \in H(a_p)$

so that:

$f(T(a_i)) \subseteq \{f(s)\} \cup f(H(a_1)) \cup \dots \cup f(H(a_{i-1}))$ for all $1 \le i \le p$
$f(t) \in f(H(a_p))$

It follows that $f(s) \rightsquigarrow_{f(\mathcal{H})} f(t)$.

Conversely, suppose that $f(t)$ is reachable from $f(s)$ in $\mathcal{H}'$,
and that $f(t) \neq f(s)$ (the case $f(t) = f(s)$ is trivial). Let
$H_0 = \{s\}$ and $T_{p+1} = \{t\}$.

By definition, there exist $a_1 = (T_1, H_1), \dots, a_p = (T_p, H_p)$ in $A$
such that for each $i \in \{1, \dots, p+1\}$,
$f(T_i) \subseteq f(H_0) \cup \dots \cup f(H_{i-1})$.

Also note that for any subset $S$ of $V$, $f(S) = S$ if
$S \cap \{x, y\} = \emptyset$, and $f(S) = S \cup \{z\} \setminus \{x, y\}$
otherwise. In particular, as soon as $z \notin f(S)$, $f(S)$ coincides with
$S$. Besides,
$f(S) \setminus \{z\} \subseteq S \subseteq f(S) \setminus \{z\} \cup \{x, y\}$.

Two cases can be distinguished:
(a) suppose that $z$ does not belong to any $f(H_j)$, so that $f(H_j) = H_j$.
Similarly, for each $i \ge 1$, $f(T_i)$ does not contain $z$, hence
$f(T_i) = T_i$. Besides, $T_i \subseteq H_0 \cup \dots \cup H_{i-1}$ for each
$i$, so that it is straightforward that $s \rightsquigarrow_{\mathcal{H}} t$.
(b) now, if $z$ is in one of the $f(H_j)$, let $k$ be the smallest integer
such that $z \in f(H_k)$. Say for instance that $x \in H_k$. Let
$(T'_1, H'_1), \dots, (T'_q, H'_q)$ be taken from a hyperpath from $x$ to $y$
in $\mathcal{H}$.

When $i \le k$, $f(T_i)$ does not contain $z$, hence $f(T_i) = T_i$ and
$T_i \subseteq f(H_0) \cup \dots \cup f(H_{i-1}) = H_0 \cup \dots \cup H_{i-1}$.

Besides, $T'_1 = \{x\} \subseteq H_0 \cup \dots \cup H_k$, and for each
$i \in \{2, \dots, q\}$,
$T'_i \subseteq H_0 \cup \dots \cup H_k \cup H'_1 \cup \dots \cup H'_{i-1}$
since $x \in H_k$.

Finally, let us prove for $i \ge k+1$ that
$T_i \subseteq H_0 \cup \dots \cup H_k \cup H'_1 \cup \dots \cup H'_q \cup H_{k+1} \cup \dots \cup H_{i-1}$.
Clearly,
$f(T_i) \setminus \{z\} \subseteq \bigcup_{j=0}^{i-1} (f(H_j) \setminus \{z\})$.
Besides, $x \in H_k$ and $y \in H'_q$, and since $T_i$ is included into
$f(T_i) \setminus \{z\} \cup \{x, y\}$, then $T_i$ is also contained in
$H_0 \cup \dots \cup H_k \cup H'_1 \cup \dots \cup H'_q \cup H_{k+1} \cup \dots \cup H_{i-1}$.

It follows that
$(T_i, H_i)_{i=1,\dots,k}, (T'_i, H'_i)_{i=1,\dots,q}, (T_i, H_i)_{i=k+1,\dots,p}$
forms a hyperpath from $s$ to $t$ in $\mathcal{H}$. □

It follows that the terminal Sccs of $\mathcal{H}$ and $f(\mathcal{H})$ are in
one-to-one correspondence. These properties can be straightforwardly extended
to the operation of merging several vertices of a same Scc simultaneously.

Using Propositions 1 and 3, we now sketch a method which computes the terminal
Sccs in a directed hypergraph $\mathcal{H} = (V, A)$. It performs several
transformations on a hypergraph $\mathcal{H}_{cur}$ whose vertices are
labelled by subsets of $V$:

Starting from the hypergraph $\mathcal{H}_{cur}$, image of $\mathcal{H}$ by
the map $u \mapsto \{u\}$,

(i) compute the terminal Sccs of the directed graph
$\mathrm{graph}(\mathcal{H}_{cur})$.

(ii) if one of them, say $C$, is not reduced to a singleton, replace
$\mathcal{H}_{cur}$ by $f(\mathcal{H}_{cur})$, where $f$ merges all the
elements $U$ of $C$ into the vertex $\bigcup_{U \in C} U$. Then go back to
Step (i).

(iii) otherwise, return the terminal Sccs of the directed graph
$\mathrm{graph}(\mathcal{H}_{cur})$.

Each time the vertex merging step (Step (ii)) is executed, new arcs may appear
in the directed graph $\mathrm{graph}(\mathcal{H}_{cur})$. This case is
illustrated in Figure 2. In both sides, the arcs of
$\mathrm{graph}(\mathcal{H}_{cur})$ are depicted in solid, and the non-simple
arcs of $\mathcal{H}_{cur}$ in dotted line. Note that the vertices of
$\mathcal{H}_{cur}$ contain subsets of $V$, but enclosing braces are omitted
for readability. Applying Step (i) from vertex $u$ (left side) discovers a
terminal Scc formed by $u$, $v$, and $w$ in the directed graph
$\mathrm{graph}(\mathcal{H}_{cur})$. At Step (ii) (right side), the vertices
are merged, and the hyperarc $a_4$ is transformed into two graph arcs leaving
the new vertex $\{u, v, w\}$.

The termination of this method is ensured by the fact that the number of
vertices in $\mathcal{H}_{cur}$ is strictly decreased each time Step (ii) is
applied. When the method is terminated, terminal Sccs of $\mathcal{H}_{cur}$
are all reduced to single vertices, each of them labelled by subsets of $V$.
Propositions 1 and 3 prove that these subsets are precisely the terminal Sccs
of $\mathcal{H}$.
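The three steps of this method can be prototyped directly. The Python sketch below is a deliberately naive illustration, not the optimized algorithm of the next section: the function names are hypothetical, the terminal Sccs of the digraph are found by mutual reachability instead of a linear time search, and the vertices of the current hypergraph are represented as frozensets of original vertices. The sample hyperarcs follow Example 1 as read from the text.

```python
def digraph_terminal_sccs(nodes, arcs):
    """Terminal Sccs of a digraph, via mutual reachability (not linear time)."""
    succ = {v: set() for v in nodes}
    for t, h in arcs:
        succ[t].add(h)

    def reach(s):
        seen, todo = {s}, [s]
        while todo:
            for w in succ[todo.pop()]:
                if w not in seen:
                    seen.add(w)
                    todo.append(w)
        return seen

    r = {v: reach(v) for v in nodes}
    terminal = set()
    for v in nodes:
        component = frozenset(w for w in r[v] if v in r[w])
        if r[v] <= component:
            terminal.add(component)
    return terminal


def terminal_sccs_by_merging(vertices, hyperarcs):
    """Steps (i)-(iii) of the sketch; returns a set of frozensets of vertices."""
    # label[v]: the vertex of the current hypergraph that contains v.
    label = {v: frozenset([v]) for v in vertices}
    while True:
        nodes = set(label.values())
        arcs = set()
        for tail, head in hyperarcs:
            tails = {label[v] for v in tail}
            if len(tails) == 1:  # simple hyperarc of the current hypergraph
                (t,) = tails
                arcs.update((t, label[h]) for h in head)
        terminal = digraph_terminal_sccs(nodes, arcs)      # Step (i)
        big = [c for c in terminal if len(c) > 1]
        if not big:                                        # Step (iii)
            return {next(iter(c)) for c in terminal}
        merged = frozenset().union(*big[0])                # Step (ii)
        for v in merged:
            label[v] = merged


ARCS = [
    ({"u"}, {"v"}),
    ({"v"}, {"w"}),
    ({"w"}, {"u"}),
    ({"v", "w"}, {"x", "y"}),
    ({"w", "y"}, {"t"}),
]
print(sorted(sorted(c) for c in terminal_sccs_by_merging("uvwxyt", ARCS)))
# [['t'], ['x'], ['y']]
```

On this input the first pass merges u, v and w; the hyperarc with tail {v, w} then becomes simple, so the merged vertex reaches x and y and is no longer terminal in the second pass.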



3.2. Optimized algorithm. The sketch given in Section 3.1 is not optimal since
a given vertex may be visited $O(|V|)$ times. To overcome this problem, we
propose to incorporate the vertex merging step directly into an algorithm
determining the terminal Sccs in directed graphs. The resulting algorithm on
directed hypergraphs is given in Figure 3. Note that we suppose that the
directed hypergraph $\mathcal{H}$ is also provided with the lists $A_u$ of
hyperarcs $a$ such that $u \in T(a)$, for each $u \in V$.²

² These lists can be built in linear time in a preprocessing step.

Figure 2. A vertex merging step (the index of the visited vertices is given
beside)



The algorithm consists of a main function TerminalScc which initializes data,
and then iteratively calls a visiting function Visit on the vertices which
have not been visited yet. Following the sketch given in Section 3.1, the
function Visit(u) repeats the following three tasks: (i) it recursively
searches a terminal Scc in the underlying directed graph
$\mathrm{graph}(\mathcal{H}_{cur})$, starting from the vertex $u$, (ii) once a
terminal Scc is found, it performs a vertex merging step on it, (iii) and
finally, it discovers the new graph arcs (if any) arising from the merging
step.

Before discussing each of these three operations, we explain how the directed
hypergraph $\mathcal{H}_{cur}$ is manipulated by the algorithm. First observe
that the vertices of the hypergraph $\mathcal{H}_{cur}$ always form a
partition of the initial set $V$ of vertices. Instead of referring to them as
subsets of $V$, we use a union-find structure, which consists of three
functions Find, Merge, and MakeSet (see for instance [CSRL01, Chap. 21]):

• a call to Find(u) returns, for each original vertex $u \in V$, the unique
vertex of $\mathcal{H}_{cur}$ containing $u$.

• two vertices $U$ and $V$ of $\mathcal{H}_{cur}$ can be merged by a call to
Merge(U, V), which returns the new vertex.

• the "singleton" vertices $\{u\}$ of the initial $\mathcal{H}_{cur}$ are
created by the function MakeSet.

With this structure, each vertex of $\mathcal{H}_{cur}$ is represented by an
element $u \in V$, in which case it corresponds to the subset
$\{v \in V \mid \mathrm{Find}(v) = u\}$. Besides, the hypergraph
$\mathcal{H}_{cur}$ is precisely the image of $\mathcal{H}$ by the function
Find.

To avoid confusion, we denote the vertices of the hypergraph $\mathcal{H}$ by
lower case letters, and the vertices of $\mathcal{H}_{cur}$ (and subsequently
$\mathrm{graph}(\mathcal{H}_{cur})$) by capital ones. By convention, if
$u \in V$, Find(u) will correspond to the associated capital letter $U$. Note
that when an element $u \in V$ has never been merged with another one, it
satisfies Find(u) = u.
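The three operations above can be realized with a standard disjoint-set forest. The Python sketch below is one conventional implementation with union by rank and path compression; the class name `UnionFind` and the method names are choices made for this illustration, not identifiers from the paper.

```python
class UnionFind:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, u):
        """Create the singleton class {u}."""
        self.parent[u] = u
        self.rank[u] = 0

    def find(self, u):
        """Return the representative of the class containing u."""
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[u] != root:  # path compression
            nxt = self.parent[u]
            self.parent[u] = root
            u = nxt
        return root

    def merge(self, u, v):
        """Merge the classes of u and v and return the new representative."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return ru
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        return ru


uf = UnionFind()
for name in "uvwxyt":
    uf.make_set(name)
U = uf.merge("u", "v")
U = uf.merge(U, "w")
print(sorted(x for x in "uvwxyt" if uf.find(x) == U))  # ['u', 'v', 'w']
print(uf.find("x"))                                    # x
```

Note that the representative returned by `merge` may be either of the two arguments, so any data attached to a vertex of the current hypergraph has to be re-attached to the returned representative.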

Discovering terminal Sccs in the directed graph
$\mathrm{graph}(\mathcal{H}_{cur})$. This task is performed by the parts of
the algorithm which are not shaded in gray, i.e. outside the auxiliary data
update step and the vertex merging step. Similarly to Tarjan's algorithm
[Tar72], it uses a stack $S$ and two arrays indexed by vertices, index and
low. The stack $S$ stores the vertices $U$ of
$\mathrm{graph}(\mathcal{H}_{cur})$ which are currently visited by Visit. The
array index tracks the order in which the vertices are visited, i.e.
index[U] < index[V] if, and only if, $U$ has been visited by Visit before $V$.
The value low[U] is used to determine the minimal index of the visited
vertices which are reachable from $U$ in the digraph (see Line 44). A (not
necessarily terminal) Scc $C$ of $\mathrm{graph}(\mathcal{H}_{cur})$ is
discovered when a vertex $U$ satisfies low[U] = index[U] (Line 49). Then $C$
consists of all the vertices stored in the stack $S$ above $U$. The vertex $U$
is the element of the Scc which has been visited first, and is called its
root. Once the visit of the Scc is terminated, its vertices are collected into
a set Finished (Line 63).

     1: function TerminalScc(H = (V, A))
     2:   n := 0, S := [ ], Finished := ∅
     3:   for all a ∈ A do
     4:     r_a := undef, c_a := 0
     5:   done
     6:   for all u ∈ V do
     7:     index[u] := undef
     8:     low[u] := undef
     9:     F_u := [ ], MakeSet(u)
    10:   done
    11:   for all u ∈ V do
    12:     if index[u] = undef then
    13:       Visit(u)
    14:     end
    15:   done
    16: end

    17: function Visit(u)
    18:   local U := Find(u), local F := [ ]
    19:   index[U] := n, low[U] := n
    20:   n := n + 1
    21:   is_term[U] := true
    22:   push U on the stack S
    23:   for all a ∈ A_u do
    24:     if |T(a)| = 1 then push a on F
    25:     else
                ▷ auxiliary data update step (Lines 26 to 33)
    26:       if r_a = undef then r_a := u
    27:       local R_a := Find(r_a)
    28:       if R_a appears in S then
    29:         c_a := c_a + 1
    30:         if c_a = |T(a)| then
    31:           push a on stack F_{R_a}
    32:         end
    33:       end
    34:     end
    35:   done
    36:   while F is not empty do
    37:     pop a from F
    38:     for all w ∈ H(a) do
    39:       local W := Find(w)
    40:       if index[W] = undef then Visit(w)
    41:       if W ∈ Finished then
    42:         is_term[U] := false
    43:       else
    44:         low[U] := min(low[U], low[W])
    45:         is_term[U] := is_term[U] and is_term[W]
    46:       end
    47:     done
    48:   done
    49:   if low[U] = index[U] then
    50:     if is_term[U] = true then    ▷ a terminal Scc is discovered
                ▷ vertex merging step (Lines 51 to 60)
    51:       local i := index[U]
    52:       pop each a from F_U and push it on F
    53:       pop V from S
    54:       while index[V] > i do
    55:         pop each a from F_V and push it on F
    56:         U := Merge(U, V)
    57:         pop V from S
    58:       done
    59:       index[U] := i, push U on S
    60:       if F is not empty then go to Line 36
    61:     end
    62:     repeat
    63:       pop V from S, add V to Finished
    64:     until index[V] = index[U]
    65:   end
    66: end

Figure 3. Computing the terminal Sccs in directed hypergraphs

Additionally, the algorithm uses an array is_term of booleans, which tracks
whether a Scc of $\mathrm{graph}(\mathcal{H}_{cur})$ is terminal. A Scc will
be terminal if, and only if, its root $U$ satisfies is_term[U] = true. In
particular, the boolean is_term[U] is set to false as soon as $U$ is connected
to a vertex $W$ located in a distinct Scc (Line 42) or satisfying
is_term[W] = false (Line 45).
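The search described above, restricted to an ordinary digraph (no merging and no auxiliary data), can be sketched in Python as follows. This is an illustration under assumptions: the function name `digraph_terminal_sccs` and the successor-dictionary input are choices made here, the update of low uses low[W] as in Figure 3, and the recursion may exceed Python's default recursion limit on deep graphs.

```python
def digraph_terminal_sccs(vertices, succ):
    """Terminal Sccs of a digraph given by a dict of successor lists."""
    index, low, is_term = {}, {}, {}
    stack, finished, result = [], set(), set()
    counter = 0

    def visit(u):
        nonlocal counter
        index[u] = low[u] = counter
        counter += 1
        is_term[u] = True
        stack.append(u)
        for w in succ.get(u, ()):
            if w not in index:
                visit(w)
            if w in finished:
                # w lies in a distinct, already closed Scc.
                is_term[u] = False
            else:
                low[u] = min(low[u], low[w])
                is_term[u] = is_term[u] and is_term[w]
        if low[u] == index[u]:  # u is the root of a Scc
            component = []
            while True:
                v = stack.pop()
                finished.add(v)
                component.append(v)
                if v == u:
                    break
            if is_term[u]:
                result.add(frozenset(component))

    for u in vertices:
        if u not in index:
            visit(u)
    return result


SUCC = {"u": ["v"], "v": ["w"], "w": ["u", "x"]}
print(sorted(sorted(c) for c in digraph_terminal_sccs("uvwxy", SUCC)))
# [['x'], ['y']]
```

Here the cycle u, v, w is a Scc but not a terminal one, because w has an arc to the already finished vertex x, which sets the flag of w to false and propagates it to the root u.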

Vertex merging step. This step is performed from Lines 51 to 60, when it is
discovered that the vertex $U$ = Find(u) is the root of a terminal Scc in the
digraph $\mathrm{graph}(\mathcal{H}_{cur})$. All vertices $V$ which have been
collected in that Scc are merged to $U$ (Line 56). Let $\mathcal{H}_{new}$ be
the resulting hypergraph.

At Line 60, the stack $F$ is expected to contain the new arcs of
$\mathrm{graph}(\mathcal{H}_{new})$ leaving the newly "big" vertex $U$ (this
point will be explained in the next paragraph). If it is empty, $\{U\}$ is a
terminal Scc of $\mathrm{graph}(\mathcal{H}_{new})$, hence also of
$\mathcal{H}_{new}$ (Prop. 1). Otherwise, we go back to the beginning of
Line 36 to discover terminal Sccs from the new vertex $U$ in the digraph
$\mathrm{graph}(\mathcal{H}_{new})$.

Discovering the new graph arcs. In this paragraph, we explain informally how
the new graph arcs arising after a vertex merging step (like in Figure 2) are
efficiently discovered, i.e. without examining all the non-simple hyperarcs.
The formal proof of this technique is provided in Appendix B.

During the execution of Visit(u), the local stack $F$ is used to collect the
hyperarcs which represent arcs leaving the vertex Find(u) in
$\mathrm{graph}(\mathcal{H}_{cur})$.

Initially, when Visit(u) is called, the vertex Find(u) is still equal to $u$.
Then, the loop from Lines 23 to 35 iterates over the set $A_u$ of the
hyperarcs $a \in A$ such that $u \in T(a)$. At the end of the loop, it can be
verified that $F$ is indeed filled with all the simple hyperarcs leaving
$u$ = Find(u) in $\mathcal{H}_{cur}$, as expected.

Now the main difficulty is to collect in $F$ the arcs which are added to the
digraph $\mathrm{graph}(\mathcal{H}_{cur})$ after a vertex merging step. To
overcome this problem, each non-simple hyperarc $a \in A$ is provided with two
auxiliary data:

• a vertex $r_a$, called the root of the hyperarc $a$, which is the first
vertex of the tail $T(a)$ to be visited by a call to Visit,

• and a counter $c_a \ge 0$, which determines the number of vertices
$x \in T(a)$ which have been visited and such that Find(x) is reachable from
Find($r_a$) in the current digraph $\mathrm{graph}(\mathcal{H}_{cur})$.

These auxiliary data are maintained in the auxiliary data update step, from
Lines 26 to 33. Initially, the root $r_a$ of any hyperarc $a$ is set to the
special value undef. The first time a vertex $u$ such that $a \in A_u$ is
visited, $r_a$ is assigned to $u$ (see Line 26). Besides, at the call to
Visit(u), the counter $c_a$ of each non-simple hyperarc $a \in A_u$ is
incremented, but only when $R_a$ = Find($r_a$) belongs to the stack $S$ (see
Line 28). This is indeed a necessary and sufficient condition for Find(u) to
be reachable from Find($r_a$) in the digraph
$\mathrm{graph}(\mathcal{H}_{cur})$ (see Invariant 6 in Appendix B).

It follows from these invariants that, when the counter $c_a$ reaches the
threshold value $|T(a)|$, all the vertices $X$ = Find(x), for $x \in T(a)$,
are reachable from $R_a$ in the digraph $\mathrm{graph}(\mathcal{H}_{cur})$.
Now suppose that, later, it is discovered that $R_a$ belongs to a terminal Scc
$C$ of $\mathrm{graph}(\mathcal{H}_{cur})$. Then the aforementioned vertices
$X$ must all stand in the Scc $C$ (since it is terminal). Therefore, when the
vertex merging step is applied on this Scc, the vertices $X$ are merged into a
single vertex $U$. Hence, the hyperarc $a$ necessarily generates new simple
arcs leaving $U$ in the new version of the digraph
$\mathrm{graph}(\mathcal{H}_{cur})$.

Now let us verify that in this situation, $a$ is correctly placed into $F$ by
our algorithm: as soon as $c_a$ reaches the threshold $|T(a)|$, $a$ is placed
into a temporary stack $F_{R_a}$ associated to the vertex $R_a$ (Line 31). It
is then emptied into $F$ at Lines 52 or 55 during the vertex merging step.

Example 2. For example, in the left side of Figure 2, the execution of the
loop from Lines 23 to 35 during the call to Visit(v) sets the root of the
hyperarc $a_4$ to the vertex $v$, and $c_{a_4}$ to 1. Then, during Visit(w),
$c_{a_4}$ is incremented to $2 = |T(a_4)|$. The hyperarc $a_4$ is therefore
pushed on the stack $F_v$ (because $R_{a_4}$ = Find($r_{a_4}$) = Find(v) =
$v$). Once it is discovered that $u$, $v$, and $w$ form a terminal Scc of
$\mathrm{graph}(\mathcal{H}_{cur})$, $a_4$ is collected into $F$ during the
merging step. This then allows visiting the vertices $x$ and $y$ from the new
vertex (rightmost hypergraph). A fully detailed execution trace is provided in
Appendix A below.

Correctness and complexity. For the sake of simplicity, we have not included
in TerminalScc the step returning the terminal Sccs. However, they can be
easily built by examining each vertex (hence in time $O(|V|)$), as shown
below:

Theorem 4. Let $\mathcal{H} = (V, A)$ be a directed hypergraph. After the
execution of TerminalScc($\mathcal{H}$), the terminal Sccs are precisely
formed by the sets
$C_U = \{v \in V \mid \mathrm{Find}(v) = U \text{ and } \mathrm{is\_term}[U] = \mathrm{true}\}$.

The proof of Th. 4, which is too long to be included here, is provided in
Appendix B. It relies on successive transformations of intermediary algorithms
to TerminalScc.
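To experiment with the algorithm of Figure 3, one possible Python transcription is sketched below. It is not the author's implementation and departs from the figure in a few places, all of which are assumptions of this sketch: the representative Find(w) is recomputed after the recursive call, low and is_term are rewritten on the representative returned by Merge, the test "appears in S" is replaced by "not in Finished" (equivalent for a visited representative), union is done by size, and terminal components are recorded when they are closed instead of being read from is_term afterwards. All identifiers are hypothetical, and deep inputs may exceed Python's recursion limit.

```python
def terminal_sccs(vertices, hyperarcs):
    """Terminal Sccs of a directed hypergraph (sketch following Figure 3)."""
    tails = [frozenset(t) for t, _ in hyperarcs]
    heads = [frozenset(h) for _, h in hyperarcs]
    arcs_of = {u: [] for u in vertices}          # the lists A_u
    for a, tail in enumerate(tails):
        for u in tail:
            arcs_of[u].append(a)
    parent = {u: u for u in vertices}            # union-find forest
    size = {u: 1 for u in vertices}

    def find(u):
        root = u
        while parent[root] != root:
            root = parent[root]
        while parent[u] != root:                 # path compression
            parent[u], u = root, parent[u]
        return root

    def merge(x, y):                             # x, y: distinct representatives
        if size[x] < size[y]:
            x, y = y, x
        parent[y] = x
        size[x] += size[y]
        return x

    root_of = [None] * len(hyperarcs)            # r_a
    count = [0] * len(hyperarcs)                 # c_a
    pending = {u: [] for u in vertices}          # the stacks F_U
    index, low, is_term = {}, {}, {}
    stack, finished, terminal_reps = [], set(), []
    n = 0

    def visit(u):
        nonlocal n
        U, F = find(u), []
        index[U] = low[U] = n
        n += 1
        is_term[U] = True
        stack.append(U)
        for a in arcs_of[u]:
            if len(tails[a]) == 1:
                F.append(a)
                continue
            if root_of[a] is None:
                root_of[a] = u
            R = find(root_of[a])
            if R not in finished:                # i.e. R is still on the stack
                count[a] += 1
                if count[a] == len(tails[a]):
                    pending[R].append(a)
        while True:
            while F:
                a = F.pop()
                for w in heads[a]:
                    if w not in index:
                        visit(w)
                    W = find(w)                  # recomputed after the visit
                    if W in finished:
                        is_term[U] = False
                    else:
                        low[U] = min(low[U], low[W])
                        is_term[U] = is_term[U] and is_term[W]
            if low[U] != index[U]:
                return
            if is_term[U]:                       # vertex merging step
                i = index[U]
                F.extend(pending[U])
                pending[U] = []
                V = stack.pop()
                while index[V] > i:
                    F.extend(pending[V])
                    pending[V] = []
                    U = merge(U, V)
                    V = stack.pop()
                index[U] = low[U] = i
                is_term[U] = True
                stack.append(U)
                if F:
                    continue                     # new arcs: back to Line 36
                terminal_reps.append(U)
            while True:                          # close the component
                V = stack.pop()
                finished.add(V)
                if index[V] == index[U]:
                    return

    for u in vertices:
        if u not in index:
            visit(u)
    groups = {U: set() for U in terminal_reps}
    for v in vertices:
        if find(v) in groups:
            groups[find(v)].add(v)
    return {frozenset(g) for g in groups.values()}
```

For instance, on the hyperarcs of Example 1 (as read from the text), `terminal_sccs("uvwxyt", [({"u"}, {"v"}), ({"v"}, {"w"}), ({"w"}, {"u"}), ({"v", "w"}, {"x", "y"}), ({"w", "y"}, {"t"})])` returns the three singletons {x}, {y} and {t}.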

The complexity of TerminalScc follows from the fact that we use disjoint-set
forests with union by rank and path compression as union-find structure
([CSRL01, Chapter 21]). It allows performing a sequence of $p$ operations
MakeSet, Find, or Merge in time $O(p \times \alpha(|V|))$, where $\alpha$ is
the very slowly growing inverse of the Ackermann function. For any practical
value of $x$, $\alpha(x) \le 4$. That is why the complexity of TerminalScc is
said to be almost linear in $\mathrm{size}(\mathcal{H})$:

Theorem 5. Let $\mathcal{H} = (V, A)$ be a directed hypergraph. Then the
algorithm TerminalScc($\mathcal{H}$) terminates in time
$O(\mathrm{size}(\mathcal{H}) \times \alpha(|V|))$.

Proof. The analysis of the time complexity of TerminalScc depends on the kind
of the instructions. We distinguish: (i) the operations on the global stacks
$F_u$ and on the local stacks $F$, (ii) the calls to the functions Find,
Merge, and MakeSet, (iii) and the other operations, referred to as usual
operations (by extension, their time complexity will be referred to as usual
complexity). Also note that the function Visit(u) is executed exactly once for
each $u \in V$ during the execution of TerminalScc. The complexity of each
kind of operations is detailed hereafter:

(i) each operation on a stack (pop or push) is in $O(1)$. A given hyperarc is
pushed on a stack of the form $F_u$ at most once during the whole execution
of TerminalScc. Once it is popped from it, it will never be pushed on a stack
of the form $F_V$ again. Similarly, a hyperarc is pushed on a local stack $F$
at most once, and after it is popped from it, it will never be pushed on any
local stack $F'$ in the subsequent states. Therefore, the total number of
stack operations on the local and global stacks $F$ and $F_u$ is bounded by
$4|A|$. It follows that the corresponding complexity is $O(|A|)$.

Consequently, the total number of iterations of the loop from Lines 38 to 47
occurring during the whole execution of TerminalScc is bounded by
$\sum_{a \in A} |H(a)|$.

(ii) during the execution of TerminalScc, the function Find is called:

• exactly $|V|$ times at Line 18,

• at most $\sum_{u \in V} |A_u| = \sum_{a \in A} |T(a)|$ times at Line 27
(since during the call to Visit(u), the loop from Lines 23 to 35 has exactly
$|A_u|$ iterations),

• at most $\sum_{a \in A} |H(a)|$ times at Line 39 (see above).

Hence it is called at most $\mathrm{size}(\mathcal{H})$ times.

The function Merge is always called to merge two distinct vertices. Let
$C_1, \dots, C_p$ ($p \le |V|$) be the equivalence classes formed by the
elements of $V$ at the end of the execution of TerminalScc. Then Merge has
been called at most $\sum_{i=1}^{p} (|C_i| - 1)$ times. Since
$\sum_i |C_i| = |V|$, Merge is executed at most $|V| - 1$ times.

Finally, MakeSet is called exactly $|V|$ times. It follows that the total time
complexity of the operations MakeSet, Find and Merge is
$O(\mathrm{size}(\mathcal{H}) \times \alpha(|V|))$.

(iii) the analysis of the usual operations is split into several parts:

• the usual complexity of TerminalScc without the calls to the function Visit
is clearly $O(|V| + |A|)$.

• during the execution of Visit(u), the usual complexity of the block from
Lines 18 to 35 is $O(1) + O(|A_u|)$. Indeed, we suppose that the test at
Line 28 can be performed in $O(1)$ by assuming that the stack $S$ is provided
with an auxiliary array of booleans which determines, for each element of $V$,
whether it is stored in $S$. Then the total usual complexity between Lines 18
and 35 is $O(\mathrm{size}(\mathcal{H}))$ for a whole execution of
TerminalScc.

• the usual complexity of the body of the loop from Lines 38 to 47, without
the recursive calls to Visit, is clearly $O(1)$. As mentioned above, the total
number of iterations of this loop is at most
$\sum_{a \in A} |H(a)| \le \mathrm{size}(\mathcal{H})$. Therefore, the total
usual complexity of the loop from Lines 36 to 48 is in
$O(\mathrm{size}(\mathcal{H}))$.

• the usual complexity of the loop between Lines 54 and 58 for a whole
execution of TerminalScc is $O(|V|)$, since in total, it is iterated exactly
the number of times the function Merge is called.

• the usual complexity of the loop between Lines 62 and 64 for a whole
execution of TerminalScc is $O(|V|)$, because a given element is placed at
most once into Finished.

• if the two previous loops are not considered, fewer than 10 usual operations
are executed in the block from Lines 49 to 66, all of complexity $O(1)$. The
execution of this block either follows a call to Visit or the execution of the
goto statement (at Line 60). The latter is executed only if the stack $F$ is
not empty. Since each hyperarc can be pushed on a local stack $F$ and then
popped from it only once, it is executed at most $|A|$ times in the worst case
during the whole execution of TerminalScc. It follows that the usual
complexity of the block from Lines 49 to 66 is $O(|V| + |A|)$ in total
(excluding the loops previously discussed).

Summing all the complexities above proves that the time complexity of Termi- 
nalScc is 0(size(K) x a(|V|). □ 
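The factor $\alpha(|V|)$ in the bound above comes from the disjoint-set operations. As an illustration only, here is a minimal Python sketch of a union-find structure with path compression and union by rank, the classical combination that yields the inverse-Ackermann amortized bound. The class and method names are hypothetical; the text does not specify which implementation of MakeSet, Find and Merge is used.

```python
class DisjointSets:
    """Union-find over hashable items (illustrative MakeSet / Find / Merge)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        # Each element starts as the representative of its own class.
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # First pass: locate the representative of x.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: path compression, every visited node points to root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def merge(self, x, y):
        # Union by rank: attach the shallower tree under the deeper one.
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return rx
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return rx


# Usage: six vertices, two merges; the result of merge plays the role of U.
ds = DisjointSets()
for z in "uvwxyt":
    ds.make_set(z)
U = ds.merge("u", "w")
U = ds.merge(U, "v")
```

Since every effective merge decreases the number of classes by one, at most $|V| - 1$ merges of distinct classes can occur, in line with the counting argument of part (ii).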

The space complexity of the algorithm TerminalScc is obviously linear in
$\mathrm{size}(\mathcal{H})$. An implementation is provided in the library TPLib [All09b] (module
Hypergraph), where the algorithm is used to efficiently characterize extreme points
in tropical polyhedra [AGG10]. It can be used independently of the rest of the
library.

3.3. Determining some other properties in almost linear time. Some properties
can be directly determined from the terminal Sccs. Indeed, a directed
hypergraph $\mathcal{H}$ admits a sink (i.e. a vertex reachable from all vertices) if, and only if,
it contains a unique terminal Scc. Besides, it is strongly connected when all
vertices are contained in this latter component.

Corollary 6. Given a directed hypergraph $\mathcal{H}$, the following problems can be solved
in almost linear time in $\mathrm{size}(\mathcal{H})$: (i) is there a sink in $\mathcal{H}$? (ii) is $\mathcal{H}$ strongly
connected?
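To make the two tests of Corollary 6 concrete, the following Python sketch derives them from the terminal Sccs. It is a brute-force reference (quadratic or worse), not the almost linear algorithm TerminalScc: reachability is computed by a naive fixpoint in which a hyperarc is traversed once its whole tail has been reached. All function names and the example hypergraph are assumptions made for illustration; hyperarcs are represented as (tail, head) pairs of non-empty sets.

```python
def reachable(hyperarcs, source):
    """Vertices reachable from `source`: a hyperarc (tail, head) can be
    traversed once every vertex of its tail has been reached."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if set(tail) <= reached and not set(head) <= reached:
                reached |= set(head)
                changed = True
    return reached


def terminal_sccs(vertices, hyperarcs):
    """Brute-force terminal Sccs: components reaching nothing but themselves."""
    reach = {u: reachable(hyperarcs, u) for u in vertices}
    result = set()
    for u in vertices:
        # Scc of u: vertices mutually reachable with u.
        scc = frozenset(v for v in reach[u] if u in reach[v])
        if reach[u] == scc:
            result.add(scc)
    return result


def has_sink(vertices, hyperarcs):
    # A sink exists if, and only if, there is a unique terminal Scc.
    return len(terminal_sccs(vertices, hyperarcs)) == 1


def is_strongly_connected(vertices, hyperarcs):
    # Strongly connected when the unique terminal Scc contains all vertices.
    return terminal_sccs(vertices, hyperarcs) == {frozenset(vertices)}


# Hypothetical example: a cycle u -> v -> w -> u plus two hyperarcs with
# two-element tails. This is an assumed hypergraph, not a figure of the paper.
V = ["u", "v", "w", "x", "y", "t"]
A = [({"u"}, {"v"}), ({"v"}, {"w"}), ({"w"}, {"u"}),
     ({"v", "w"}, {"x", "y"}), ({"w", "y"}, {"t"})]
```

On this example the vertices x, y and t each form a terminal Scc of their own (y alone cannot traverse the hyperarc with tail {w, y}), so there is neither a sink nor strong connectivity.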



Obviously, the push and pop operations on the stack $S$ are still in $O(1)$ under this assumption.
Note that in the source code, terminal Sccs are referred to as maximal Sccs.

Figure 4. The directed hypergraph $\mathcal{H}(\mathcal{F}, D)$, with $D = \{x_1, \dots, x_4\}$ and
$\mathcal{F}$ consisting of $S_1 = \{x_1, x_2, x_4\}$, $S_2 = \{x_1, x_2, x_3\}$, and $S_3 = \{x_1, x_2\}$.

4. Combinatorial complexity of the reachability relation in 
directed hypergraphs 

4.1. A lower bound on the size of the transitive reduction. Given a directed
graph or a directed hypergraph, the reachability relation can be represented
by the set of the couples $(x, y)$ such that $x$ reaches $y$. This is however a particularly
redundant representation because of transitivity. In order to get a better
idea of the intrinsic complexity of the reachability relation, we are rather interested
in more economical representations. In fact, the reachability relation admits
transitive reductions, which are defined as minimal binary relations having the same
transitive closure.

In directed graphs, Aho et al. have shown in [AGU72] that all transitive reductions
of the reachability relation have the same size (the size of a binary relation
$\mathcal{R}$ is the number of couples $(x, y)$ such that $x \mathrel{\mathcal{R}} y$). This size is bounded by the
size of the graph. Furthermore, a canonical transitive reduction can be defined, by
choosing a total ordering over the vertices.

In directed hypergraphs, the existence of a canonical transitive reduction of
the reachability relation can be similarly established, because reachability is still
reflexive and transitive. However, we are going to show that its size may be
superlinear in $\mathrm{size}(\mathcal{H})$ for some directed hypergraphs $\mathcal{H}$.

These hypergraphs arise from the subset partial order. More specifically, given
a family $\mathcal{F}$ of distinct sets over a finite domain $D$, the partial order induced by
the relation $\subseteq$ on $\mathcal{F}$ is called the subset partial order over $\mathcal{F}$. From this family, we
build a corresponding directed hypergraph $\mathcal{H}(\mathcal{F}, D)$. Each of its vertices is either
associated to a set $S \in \mathcal{F}$ or to a domain element $x \in D$, and is denoted by $v[S]$ or
$v[x]$ respectively. Besides, each set $S$ is associated to two hyperarcs $a[S]$ and $a'[S]$.
The hyperarc $a[S]$ leaves the singleton $\{v[S]\}$ and enters the set of the vertices $v[x]$
such that $x \in S$. The hyperarc $a'[S]$ is defined inversely, leaving the latter set and
entering $\{v[S]\}$. An example is given in Figure 4.



Any finite reflexive and transitive relation $\mathcal{R}$ can be seen as the reachability relation of a
directed graph $G$, whose arcs are the couples $(x, y)$ such that $x \mathrel{\mathcal{R}} y$. Then the transitive reduction
of $\mathcal{R}$ is defined as in [AGU72].

Lemma 7. Given $S \in \mathcal{F}$, $v$ is reachable from $v[S]$ in $\mathcal{H}(\mathcal{F}, D)$ if, and only if,
$v = v[S']$ for some $S' \in \mathcal{F}$ such that $S' \subseteq S$, or $v = v[x]$ for some $x \in S$.

Proof. Clearly, any vertex $v[x]$ with $x \in S$ is reachable from $v[S]$ through the hyperarc $a[S]$.
Besides, assuming $S \supseteq S'$, $v[S]$ reaches $v[S']$ through the hyperpath formed
by the hyperarcs $a[S]$ and $a'[S']$.

Now, let us prove by induction that these are the only vertices reachable from
$v[S]$. Let $u$ be reachable from $v[S]$. If $u = v[S]$, then this is obvious. Otherwise,
there exists a hyperarc $a = (T, H)$ such that $u \in H$ and $T = \{u_1, \dots, u_q\}$ with each
$u_i$ being reachable from $v[S]$. We can distinguish two cases:

(i) either $a$ is of the form $a[S']$ for some $S' \in \mathcal{F}$, in which case the tail is reduced
to the vertex $v[S']$, which is reachable from $v[S]$. By induction, we know that
$S \supseteq S'$. Since $u = v[x]$ for some $x \in S'$, it follows that $x \in S$.

(ii) or $a$ is of the form $a'[S']$ for some $S' \in \mathcal{F}$. Then its tail is the set of the $v[x]$
for $x \in S'$, and its head consists of the single vertex $v[S']$. Thus $x \in S$ for all
$x \in S'$ by induction, which ensures that $u = v[S']$ with $S' \subseteq S$. □
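The characterization of Lemma 7 can be checked mechanically on a small instance. The Python sketch below builds the hyperarcs $a[S]$ and $a'[S]$ of $\mathcal{H}(\mathcal{F}, D)$ for the family of Figure 4 and compares the reachable set of each $v[S]$ with the set predicted by the lemma. The vertex encoding (tagged tuples) and the function names are illustrative assumptions, and reachability is computed by a naive fixpoint rather than by an efficient algorithm.

```python
def reachable(hyperarcs, source):
    """Naive hypergraph reachability: a hyperarc fires once its tail is reached."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


def build_hypergraph(family):
    """Hyperarcs of H(F, D): a[S] from {v[S]} to its elements, a'[S] back."""
    hyperarcs = []
    for S in family:
        elements = frozenset(("x", e) for e in S)
        set_vertex = frozenset({("S", S)})
        hyperarcs.append((set_vertex, elements))   # a[S]
        hyperarcs.append((elements, set_vertex))   # a'[S]
    return hyperarcs


def lemma7_prediction(family, S):
    """Vertices that Lemma 7 claims are reachable from v[S]."""
    return {("S", T) for T in family if T <= S} | {("x", e) for e in S}


# Family of Figure 4: S1 = {x1, x2, x4}, S2 = {x1, x2, x3}, S3 = {x1, x2}.
family = [frozenset({1, 2, 4}), frozenset({1, 2, 3}), frozenset({1, 2})]
arcs = build_hypergraph(family)
checks = [reachable(arcs, ("S", S)) == lemma7_prediction(family, S)
          for S in family]
```

Here `checks` records, for each set of the family, whether the computed reachable set coincides with the prediction of the lemma.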

Up to adding an extra element to the domain $D$ and to each set $S \in \mathcal{F}$, it can
be assumed that $|S| > 1$ for all $S$. In this case, the directed hypergraph $\mathcal{H}(\mathcal{F}, D)$
can be shown to be acyclic:

Lemma 8. Let $\mathcal{F}$ be a family of distinct sets over $D$. Assuming that $|S| > 1$ for
all $S \in \mathcal{F}$, the directed hypergraph $\mathcal{H}(\mathcal{F}, D)$ is acyclic.

Proof. Suppose that $\mathcal{H}(\mathcal{F}, D)$ contains a non-trivial cycle. If this cycle contains two
distinct vertices $v[S]$ and $v[S']$, then by Lemma 7 we should have $S = S'$, which
contradicts the distinctness assumption over the sets of $\mathcal{F}$. Thus, the cycle should
contain at least a vertex $v[x]$ for some $x \in D$. However, since $|S| > 1$ for all $S$
containing $x$, $v[x]$ does not reach any vertices except itself. Therefore, $v[x]$ cannot
belong to any (non-trivial) cycle, which provides a contradiction. □

Then the following proposition holds: 

Proposition 9. The size of the transitive reduction of the reachability relation of
$\mathcal{H}(\mathcal{F}, D)$ is lower bounded by the size of the transitive reduction of the subset partial
order over the family $\mathcal{F}$.

Proof. We are going to show by contraposition that for any couple $(S, S')$ in the
transitive reduction of the subset partial order over the family $\mathcal{F}$, $(v[S'], v[S])$ must
belong to the transitive reduction of the relation $\rightsquigarrow_{\mathcal{H}(\mathcal{F}, D)}$.

Suppose that the pair $(v[S'], v[S])$ is not in the transitive reduction of $\rightsquigarrow_{\mathcal{H}(\mathcal{F}, D)}$.
If $S \not\subseteq S'$, then naturally, $(S, S')$ does not belong to the transitive reduction of
the subset partial order over $\mathcal{F}$. Now, let us assume $S \subseteq S'$. Then there exists a
sequence $u_1, \dots, u_p$ of $p$ vertices of $\mathcal{H}(\mathcal{F}, D)$ ($p > 2$) such that $u_1 = v[S']$, $u_p = v[S]$,
and $u_1 \rightsquigarrow_{\mathcal{H}(\mathcal{F}, D)} \dots \rightsquigarrow_{\mathcal{H}(\mathcal{F}, D)} u_p$. Observe that any vertex reaching a vertex of the
form $v[T]$ ($T \in \mathcal{F}$) is necessarily of the form $v[T']$ for some $T' \in \mathcal{F}$ (because of
the assumption $|T| > 1$ which ensures that no vertex of the form $v[x]$ for $x \in D$
can reach $v[T]$). Consequently, there exist $S_1, \dots, S_p \in \mathcal{F}$ such that $u_i = v[S_i]$ for
all $1 \le i \le p$. Following Lemma 7, this shows that $S_1 \supsetneq \dots \supsetneq S_p$. Since $p > 2$,
$(S, S') = (S_p, S_1)$ cannot belong to the transitive reduction of the subset partial
order over $\mathcal{F}$. □

The subset partial order has been well studied in the literature [YJ93, Pri95,
Pri99a, Pri99b, Elm09]. It has been proved in [YJ93, Elm09] that the size of the
transitive reduction of the subset partial order can be superlinear in the size of the
input $(\mathcal{F}, D)$ (defined as $|D| + \sum_{S \in \mathcal{F}} |S|$). Combining this with Proposition 9 provides
the following result:

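Theorem 10. There is a directed hypergraph $\mathcal{H}$ such that the size of the transitive
reduction of the reachability relation is in $\Omega(\mathrm{size}(\mathcal{H})^2 / \log^2(\mathrm{size}(\mathcal{H})))$.

Proof. We use the construction given in [Elm09], in which $\mathcal{F}$ consists of two disjoint
families $\mathcal{F}_1$ and $\mathcal{F}_2$ of sets over the domain $D = \{x_1, \dots, x_n\}$ (where $n$ is supposed
to be divisible by 4). The first family is formed by the subsets containing all
the elements $x_1, \dots, x_{n/2}$, and precisely $n/4$ elements among $x_{n/2+1}, \dots, x_n$. The
second family consists of the subsets having $n/4$ elements among $x_1, \dots, x_{n/2}$.

Clearly, the transitive reduction of the subset partial order over $\mathcal{F}$ coincides with
the cartesian product $\mathcal{F}_2 \times \mathcal{F}_1$. Each $\mathcal{F}_i$ precisely contains $\binom{n/2}{n/4} = \Theta(2^{n/2} / \sqrt{n})$ sets,
so that the size of the transitive reduction of the subset partial order is $\Theta(2^n / n)$.

Proposition 9 shows that the size of the transitive reduction of the reachability
relation $\rightsquigarrow_{\mathcal{H}(\mathcal{F}, D)}$ is in $\Omega(2^n / n)$. Now, the size of the directed hypergraph $\mathcal{H}(\mathcal{F}, D)$
is equal to:

$$\mathrm{size}(\mathcal{H}(\mathcal{F}, D)) = n + 2\binom{n/2}{n/4} + 2\left(\frac{3n}{4} + 1\right)\binom{n/2}{n/4} + 2\left(\frac{n}{4} + 1\right)\binom{n/2}{n/4},$$

so that $\mathrm{size}(\mathcal{H}(\mathcal{F}, D)) = \Theta(\sqrt{n}\, 2^{n/2})$ and $\log(\mathrm{size}(\mathcal{H}(\mathcal{F}, D))) = \Theta(n)$. Hence
$2^n / n = \Theta(\mathrm{size}(\mathcal{H}(\mathcal{F}, D))^2 / \log^2(\mathrm{size}(\mathcal{H}(\mathcal{F}, D))))$, which provides the expected result. □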
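The counting in this proof can be reproduced numerically for small $n$. The sketch below enumerates the two families described above, computes the size of the transitive reduction of the subset partial order by brute force, and evaluates the size of $\mathcal{H}(\mathcal{F}, D)$. It assumes that the size of a hypergraph is $|V| + \sum_{a} (|T(a)| + |H(a)|)$; this definition and all function names are assumptions of the sketch.

```python
from itertools import combinations
from math import comb


def two_level_family(n):
    """The families F1 and F2 described in the proof (n divisible by 4)."""
    assert n % 4 == 0
    low = range(1, n // 2 + 1)
    high = range(n // 2 + 1, n + 1)
    f1 = [frozenset(low) | frozenset(c) for c in combinations(high, n // 4)]
    f2 = [frozenset(c) for c in combinations(low, n // 4)]
    return f1, f2


def transitive_reduction_size(family):
    """Number of pairs (a, b) with a strictly inside b and no set in between."""
    count = 0
    for a in family:
        for b in family:
            if a < b and not any(a < c < b for c in family):
                count += 1
    return count


def hypergraph_size(family, n):
    """Assumed size measure: |V| plus tail and head sizes of all hyperarcs.
    Each set S contributes a[S] and a'[S], of total weight 2 * (|S| + 1)."""
    return n + len(family) + sum(2 * (len(s) + 1) for s in family)


def summary(n):
    f1, f2 = two_level_family(n)
    family = f1 + f2
    return transitive_reduction_size(family), hypergraph_size(family, n)
```

For a given `n`, `summary(n)` returns the pair (size of the transitive reduction, size of the hypergraph), which can be compared with $\binom{n/2}{n/4}^2$ and with the closed formula of the proof.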

Theorem 10 highlights an important difference between directed graphs and
hypergraphs. Unlike graphs, hypergraphs do not admit any economical representation
of the reachability relation having a size in $O(\mathrm{size}(\mathcal{H}))$. As a consequence, the
reachability relation embedded in directed hypergraphs is combinatorially more complex
than in directed graphs.

4.2. Reachability problems in directed hypergraphs and combinatorial
problems on sets. The lower bound provided by Theorem 10 suggests that solving
some reachability-related problems in directed hypergraphs may not be as easy as
in digraphs. This is confirmed by the results of this section, in which we exhibit
linear time reductions of problems on the subset partial order to such reachability
problems.

Topological sort and linear extension. The topological sort of an acyclic directed
hypergraph $\mathcal{H}$ refers to a total ordering $\preceq$ of the vertices such that $u \preceq v$ as soon
as $u \rightsquigarrow_{\mathcal{H}} v$. Using the hypergraphs $\mathcal{H}(\mathcal{F}, D)$ built from families of sets introduced
in Section 4.1, we can establish the following result:

Proposition 11. There is a linear time reduction from the problem of determining
a linear extension of the subset partial order over a family of sets, to the problem
of topologically sorting the vertices of an acyclic directed hypergraph.

Proof. Consider a family $\mathcal{F}$ of sets over a domain $D$. The directed hypergraph
$\mathcal{H}(\mathcal{F}, D)$ can be built in linear time in the size of $(\mathcal{F}, D)$ (i.e. $|D| + \sum_{S \in \mathcal{F}} |S|$).
Suppose that we now have a topological ordering $\preceq$ over the vertices. Without loss
of generality, it can be supposed that it is given by a real-valued function $f$ such
that $u \preceq v$ if, and only if, $f(u) \le f(v)$. By Lemma 7, for any two sets $S, S' \in \mathcal{F}$ such
that $S' \subseteq S$, we have $f(v[S]) \le f(v[S'])$. It follows that the function $g : \mathcal{F} \to \mathbb{R}$
defined by $g(S) = -f(v[S])$ yields a linear extension of the partial order over $\mathcal{F}$. □
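The reduction of Proposition 11 can be sketched as follows in Python. Since no efficient topological sort of directed hypergraphs is assumed here, the function $f$ is obtained by a brute-force stand-in: if $u$ reaches $v$, then every vertex reachable from $v$ is reachable from $u$, so $f(u) = -|\mathrm{reach}(u)|$ satisfies $f(u) \le f(v)$. This stand-in and all names are assumptions of the sketch; only the last step, $g(S) = -f(v[S])$, is the reduction itself.

```python
def reachable(hyperarcs, source):
    """Naive hypergraph reachability: a hyperarc fires once its tail is reached."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


def linear_extension(family):
    """Order the sets of `family` compatibly with inclusion, via H(F, D)."""
    arcs = []
    for S in family:
        elements = frozenset(("x", e) for e in S)
        set_vertex = frozenset({("S", S)})
        arcs.append((set_vertex, elements))   # a[S]
        arcs.append((elements, set_vertex))   # a'[S]
    # Stand-in for the topological sort oracle: f(u) = -|reach(u)|.
    f = {S: -len(reachable(arcs, ("S", S))) for S in family}
    # Reduction step: g(S) = -f(v[S]) is monotone for inclusion.
    return sorted(family, key=lambda S: -f[S])


# Example family in which every set has more than one element.
family = [frozenset({1, 2, 3, 4}), frozenset({1, 2, 4}),
          frozenset({1, 2, 3}), frozenset({1, 2})]
order = linear_extension(family)
```

In the resulting list `order`, every set appears after all of its strict subsets that belong to the family.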

To our knowledge, the problem of determining a linear extension of the subset
partial order over a family $\mathcal{F}$ of sets has not been particularly studied. It is probably
not obvious to solve this problem without examining a significant part of the subset
partial order (or at least of a sparse representation such as its transitive reduction).
The best known methods to compute the subset partial order have a complexity in
$O(N^2 / \log^2 N)$ in the dense case [Elm09], and $O(N^2 / \log N)$ in general (e.g. [Pri95]),
where $N$ is the size of the input. In comparison, topologically sorting directed
graphs can be solved in linear time.

Strongly connected components and the minimal set problem. Given a family $\mathcal{F}$
of distinct sets as above, the minimal set problem consists in finding the minimal
sets $S \in \mathcal{F}$ for the subset partial order $\subseteq$. This problem has received much
attention [Pri91, Yel92, YJ93, Pri95, Pri99b, Elm09, BP11]. It has important
applications in propositional logic [Pri91] or data mining [BP11]. It can also be seen
as a boolean case of the problem of finding maximal vectors among a given family
[KLP75, KS85, GSG05].

We establish a linear time reduction from the minimal set problem to the problem
of determining the strongly connected components in a directed hypergraph. Given
a family $\mathcal{F}$ of sets over the domain $D$, we build a directed hypergraph $\overline{\mathcal{H}}(\mathcal{F}, D)$
starting from the hypergraph $\mathcal{H}(\mathcal{F}, D)$. On top of the vertices of the latter, $\overline{\mathcal{H}}(\mathcal{F}, D)$
has the following vertices: (i) for each $S \in \mathcal{F}$, an additional vertex $w[S]$, (ii) $(|D| + 1)$
vertices labelled by $c_0, \dots, c_{|D|}$, (iii) and a special vertex labelled by superset.
Besides, we add the following hyperarcs: (i) for each $S \in \mathcal{F}$, a hyperarc leaving
$\{v[S]\}$ and entering the singleton $\{c_{|S|-1}\}$, (ii) for every $0 \le i \le |D|$, a hyperarc
leaving $\{c_i\}$ and entering the set of the vertices $w[S]$ such that $i = |S|$, (iii) for
each $i > 0$, a hyperarc from $\{c_i\}$ to $\{c_{i-1}\}$, (iv) for each $S \in \mathcal{F}$, a hyperarc leaving
the set $\{v[S], w[S]\}$ and entering the singleton {superset}, (v) for every $S \in \mathcal{F}$, a
hyperarc from {superset} to $\{v[S]\}$. This construction is illustrated in Figure 5.

Every vertex $v[S]$ is reachable from superset. Conversely, it can be shown that
$v[S]$ reaches superset if, and only if, $S$ is not minimal, meaning that there exists
$S' \in \mathcal{F}$ such that $S \supsetneq S'$:

Proposition 12. For any $S \in \mathcal{F}$, $S$ is not minimal in $\mathcal{F}$ if, and only if, superset
is reachable from $v[S]$ in $\overline{\mathcal{H}}(\mathcal{F}, D)$.

Proof. Assume that $S$ is not minimal in $\mathcal{F}$, and let $S' \in \mathcal{F}$ satisfy $S' \subsetneq S$.
Then by Lemma 7, $v[S']$ is reachable from $v[S]$ in $\mathcal{H}(\mathcal{F}, D)$, and hence in $\overline{\mathcal{H}}(\mathcal{F}, D)$.
Besides, since $|S'| = j < |S| = i$, $w[S']$ is reachable from $v[S]$ through the
hyperpath traversing the vertices $c_j, c_{j+1}, \dots, c_{i-1}$. Finally, the vertex superset is
reachable through the hyperarc from $\{v[S'], w[S']\}$.

Conversely, suppose that $v[S]$ reaches superset in $\overline{\mathcal{H}}(\mathcal{F}, D)$. Consider a minimal
hyperpath $a_1, \dots, a_p$ from $v[S]$ to superset. Necessarily, $a_p$ is a hyperarc of the
form $(\{v[S'], w[S']\}, \{\text{superset}\})$ for some $S' \in \mathcal{F}$. Consequently, both vertices $v[S']$
and $w[S']$ are reachable from $v[S]$. Besides, to each of the two vertices, there exists
a hyperpath from $v[S]$ which does not contain the vertex superset (meaning that
the latter does not appear in any tail or head of the hyperarcs of the hyperpath).
These two hyperpaths are subsequences of $a_1, \dots, a_p$.

Figure 5. The hypergraph $\overline{\mathcal{H}}(\mathcal{F}, D)$, where $D = \{x_1, \dots, x_4\}$
and $\mathcal{F}$ consists of $S_1 = \{x_1, x_2, x_4\}$, $S_2 = \{x_1, x_2, x_3\}$, and
$S_3 = \{x_1, x_2\}$. The hyperarcs of $\mathcal{H}(\mathcal{F}, D)$ are depicted in gray.

Thus, suppose that $a'_1, \dots, a'_q$ is a minimal hyperpath from $v[S]$ to $v[S']$ not
containing superset. In this case, no vertex of the form $w[T]$ for $T \in \mathcal{F}$ appears
in the hyperpath, since otherwise, the vertex superset should also appear (the only
hyperarc from $w[T]$ enters superset), or the hyperpath would not be minimal (we
could remove the hyperarc leading to $w[T]$). Similarly, no vertex of the form $c_j$
belongs to the hyperpath, since otherwise, it should also contain a vertex of the
form $w[T]$ (or the hyperpath would not be minimal). It follows that the hyperpath
$a'_1, \dots, a'_q$ is also a hyperpath in the hypergraph $\mathcal{H}(\mathcal{F}, D)$. Applying Lemma 7 then
shows that $S' \subseteq S$.

It remains to show that the latter inclusion is strict. Similarly, let $a''_1, \dots, a''_r$ be
a minimal hyperpath from $v[S]$ to $w[S']$ not containing superset. Then the tail of
$a''_r$ is necessarily reduced to the vertex $c_i$, where $i = |S'|$, and its head is the set
of the vertices $w[T]$ such that $|T| = i$. It follows that $a''_1, \dots, a''_{r-1}$ is a hyperpath
from $v[S]$ to $c_i$ not containing superset.

Now suppose that $i \ge |S|$. Let $j \ge i$ be the greatest integer such that $c_j$ appears in
the hyperpath $a''_1, \dots, a''_{r-1}$. Necessarily, one of the hyperarcs in the hyperpath is
of the form $(\{v[T]\}, \{c_j\})$, so that $v[T]$ is reachable from $v[S]$ through a hyperpath
not passing through the vertex superset. It follows from the previous discussion
that $T \subseteq S$. But $|T| = j + 1 > i \ge |S|$, which is a contradiction. This shows that
$i = |S'| < |S|$, hence $S' \subsetneq S$. □

As a consequence, minimal sets of the family $\mathcal{F}$ are precisely given by the vertices
of the form $v[S]$ which do not belong to the Scc of the vertex superset. This proves
the following complexity reduction:

Theorem 13. The minimal set problem can be reduced in linear time to the problem
of determining the Sccs in a directed hypergraph.

Proof. We assume the existence of an oracle providing the Sccs of any directed
hypergraph.

Consider an instance $(\mathcal{F}, D)$ of the minimal set problem. The hypergraph
$\overline{\mathcal{H}}(\mathcal{F}, D)$ can be built in linear time in the size of the input. Calling the oracle
on $\overline{\mathcal{H}}(\mathcal{F}, D)$ yields its Sccs. Then, by examining each Scc and its content, we
collect the $S \in \mathcal{F}$ such that $v[S]$ does not belong to the same component as the
vertex superset. We finally return these sets. By Proposition 12, they are precisely
the minimal sets in the family $\mathcal{F}$. □
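The reduction can be illustrated by the following Python sketch, which builds the hyperarcs of the extended hypergraph (the five kinds of hyperarcs listed before Proposition 12, on top of $a[S]$ and $a'[S]$) and reads the minimal sets off the component of the vertex superset. The Scc oracle is replaced here by a naive reachability fixpoint, so the sketch does not run in linear time; the vertex encoding and function names are assumptions, and all sets are assumed non-empty.

```python
def reachable(hyperarcs, source):
    """Naive hypergraph reachability: a hyperarc fires once its tail is reached."""
    reached = {source}
    changed = True
    while changed:
        changed = False
        for tail, head in hyperarcs:
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


SUPERSET = ("superset",)


def build_extended_hypergraph(family, domain):
    """Hyperarcs of the extended hypergraph (non-empty sets assumed)."""
    arcs = []
    for S in family:
        elements = {("x", e) for e in S}
        arcs.append(({("v", S)}, elements))                # a[S]
        arcs.append((elements, {("v", S)}))                # a'[S]
        arcs.append(({("v", S)}, {("c", len(S) - 1)}))     # kind (i)
        arcs.append(({("v", S), ("w", S)}, {SUPERSET}))    # kind (iv)
        arcs.append(({SUPERSET}, {("v", S)}))              # kind (v)
    for i in range(len(domain) + 1):
        head = {("w", S) for S in family if len(S) == i}
        if head:
            arcs.append(({("c", i)}, head))                # kind (ii)
        if i > 0:
            arcs.append(({("c", i)}, {("c", i - 1)}))      # kind (iii)
    return arcs


def minimal_sets(family, domain):
    """Sets S whose vertex v[S] is not in the Scc of the vertex superset."""
    arcs = build_extended_hypergraph(family, domain)
    from_superset = reachable(arcs, SUPERSET)
    result = []
    for S in family:
        same_scc = (("v", S) in from_superset
                    and SUPERSET in reachable(arcs, ("v", S)))
        if not same_scc:
            result.append(S)
    return result


# Family of Figure 5: S1 = {x1, x2, x4}, S2 = {x1, x2, x3}, S3 = {x1, x2}.
fig5 = [frozenset({1, 2, 4}), frozenset({1, 2, 3}), frozenset({1, 2})]
fig5_minimal = minimal_sets(fig5, {1, 2, 3, 4})
```

On the family of Figure 5, only $S_3$ is minimal: from $v[S_1]$ or $v[S_2]$ the vertex $c_2$ is reached, which gives access to $w[S_3]$ and then to superset, whereas from $v[S_3]$ only $c_1$ and $c_0$ are reached.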

No algorithm is known to solve the minimal set problem in linear time. Surprisingly,
the most efficient algorithms addressing the problem compute the whole
subset partial order [YJ93, Elm09], so that the best known time complexity is in
$O(N^2 / \log^k N)$ ($k = 1$ or $2$).

Remark 3. Another interesting combinatorial problem is to decide whether a
collection of sets is a Sperner family, i.e. no two distinct sets of the collection are
comparable for inclusion. As a consequence of Theorem 13, it can be shown that the
problem of deciding whether a collection of sets is a Sperner family can be reduced
in linear time to the problem of determining the Sccs in a directed hypergraph. The
Sperner family problem can indeed be reduced in linear time to the minimal set
problem, by examining whether the number of minimal sets of $\mathcal{F}$ is equal to the
cardinality of $\mathcal{F}$.
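The last step of this remark is short enough to be written down directly. In the Python sketch below, the minimal sets are computed by brute force, as a stand-in for any solver of the minimal set problem (including one obtained through Theorem 13); the function names are hypothetical and the family is assumed to consist of distinct sets.

```python
def minimal_sets_bruteforce(family):
    """Stand-in oracle: sets having no strict subset in the family."""
    return [S for S in family if not any(T < S for T in family)]


def is_sperner(family):
    """A family of distinct sets is Sperner iff all of its sets are minimal."""
    return len(minimal_sets_bruteforce(family)) == len(family)


antichain = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
chain = [frozenset({1}), frozenset({1, 2})]
```

The first family is a Sperner family, while the second is not, since {1} is contained in {1, 2}.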



5. Conclusion 

In this paper, we have studied several aspects related to reachability and strongly
connected components in directed hypergraphs. We have defined an algorithm
which determines all terminal Sccs in almost linear time. In comparison,
the previous approaches run in quadratic time. As a consequence, two other
important problems, testing strong connectivity and the existence of a sink, can be
solved in almost linear time.

We have also shown that the reachability relation in directed hypergraphs is
more complex than in directed graphs, by proving a superlinear lower bound on
the size of its transitive reduction (Th. 10). We have defined linear time reductions
from combinatorial problems on set families to reachability problems in directed
hypergraphs, in particular from the minimal set problem to the problem of
determining the Sccs of a directed hypergraph (Th. 13). This strongly suggests that the
latter may not be solvable in linear time as in directed graphs. These reductions
also strengthen the interest in finding efficient algorithms to determine all Sccs in
directed hypergraphs.

For future work, we consequently plan to study how to generalize the algorithm
introduced in Section 3 to find all Sccs, hopefully improving the existing
complexity bounds on the minimal set problem. In parallel, it would be interesting to
study complexity lower bounds (most likely superlinear ones) on the problem of
computing the strongly connected components. We think that the reduction from
combinatorial problems on sets could be helpful to derive such bounds.

Appendix A. An Example of Complete Execution Trace of the Algorithm of Section 3

We give the main steps of the execution of the algorithm TerminalScc on the
directed hypergraph depicted in Figure 1.

Vertices are depicted by solid circles if their index is defined, and by dashed circles
otherwise. Once a vertex is placed into Finished, it is depicted in gray. Similarly, a
hyperarc which has never been placed into a local stack $F$ is represented by dotted
lines. Once it is pushed into $F$, it becomes solid, and when it is popped from $F$, it
is colored in gray (note that for the sake of readability, gray hyperarcs mapped to
cycles after a vertex merging step will be removed). The stack $F$ which is mentioned
always corresponds to the stack local to the last non-terminated call of the function
Visit.

Initially, Find(z) = z for all $z \in \{u, v, w, x, y, t\}$. We suppose that Visit(u) is
called first. After the execution of the block from Lines 18 to 35, the current state
is:



index[u] = 0, low[u] = 0, is_term[u] = true
S = [u], n = 1, F = [a_1]



Following the hyperarc $a_1$, Visit(v) is called during the execution of the block from
Lines 36 to 48 of Visit(u). After Line 35 in Visit(v), the root of the hyperarc $a_4$ is
set to $v$, and the counter $c_{a_4}$ is incremented to 1 since $v \in S$. The state is:

index[u] = 0, low[u] = 0, is_term[u] = true
index[v] = 1, low[v] = 1, is_term[v] = true
r_{a_4} = v, c_{a_4} = 1
S = [v; u], n = 2, F = [a_2]



Similarly, the function Visit(w) is called during the execution of the loop from
Lines 36 to 48 in Visit(v). After Line 35 in Visit(w), the root of the hyperarc $a_5$
is set to $w$, and the counter $c_{a_5}$ is incremented to 1 since $w \in S$. Besides, $c_{a_4}$
is incremented to $2 = |T(a_4)|$ since Find($r_{a_4}$) = Find(v) = $v \in S$, so that $a_4$ is
pushed on the stack $F_v$. The state is:

index[u] = 0, low[u] = 0, is_term[u] = true
index[v] = 1, low[v] = 1, is_term[v] = true
index[w] = 2, low[w] = 2, is_term[w] = true
r_{a_4} = v, c_{a_4} = 2
r_{a_5} = w, c_{a_5} = 1
S = [w; v; u], n = 3, F = [a_3], F_v = [a_4]

The execution of the loop from Lines 36 to 48 of Visit(w) discovers that index[u]
is defined but $u \notin$ Finished, so that low[w] is set to min(low[w], low[u]) = 0 and
is_term[w] to is_term[w] && is_term[u] = true. At the end of the loop, the state is
therefore:

index[u] = 0, low[u] = 0, is_term[u] = true
index[v] = 1, low[v] = 1, is_term[v] = true
index[w] = 2, low[w] = 0, is_term[w] = true
r_{a_4} = v, c_{a_4} = 2
r_{a_5} = w, c_{a_5} = 1
S = [w; v; u], n = 3, F = [], F_v = [a_4]



Since low[w] ≠ index[w], the block from Lines 49 to 66 is not executed, and Visit(w)
terminates. Back to the loop from Lines 36 to 48 in Visit(v), low[v] is assigned to the
value min(low[v], low[w]) = 0, and is_term[v] to is_term[v] && is_term[w] = true:

index[u] = 0, low[u] = 0, is_term[u] = true
index[v] = 1, low[v] = 0, is_term[v] = true
index[w] = 2, low[w] = 0, is_term[w] = true
r_{a_4} = v, c_{a_4} = 2
r_{a_5} = w, c_{a_5} = 1
S = [w; v; u], n = 3, F = [], F_v = [a_4]



Since low[v] ≠ index[v], the block from Lines 49 to 66 is not executed, and Visit(v)
terminates. Back to the loop from Lines 36 to 48 in Visit(u), low[u] is assigned to
the value min(low[u], low[v]) = 0, and is_term[u] to is_term[u] && is_term[v] = true.
Therefore, at Line 49, the conditions low[u] = index[u] and is_term[u] = true hold,
so that a vertex merging step is executed. At that point, the stack $F$ is empty.
After that, $i$ is set to index[u] = 0 (Line 51), and $F_u$ = [] is emptied to $F$ (Line 52),
so that $F$ is still empty. Then $w$ is popped from $S$, and since index[w] = 2 > i = 0,
the loop from Lines 54 to 58 is iterated. Then the stack $F_w$ = [] is emptied in $F$.
At Line 56, Merge(u, w) is called. The result is denoted by $U$ (in practice, either
$U = u$ or $U = w$). The state is:

index[v] = 1, low[v] = 0, is_term[v] = true
index[U] = 0 or 2, low[U] = 0, is_term[U] = true
r_{a_4} = v, c_{a_4} = 2
r_{a_5} = w, c_{a_5} = 1
S = [v; u], n = 3, i = 0, F = [], F_v = [a_4]
U = Find(u) = Find(w)



Then $v$ is popped from $S$, and since index[v] = 1 > i = 0, the loop from Lines 54 to 58 is
iterated again. Then the stack $F_v$ = [a_4] is emptied in $F$. At Line 56, Merge(U, v)
is called. The result is set to $U$ (in practice, $U$ is one of the vertices $u$, $v$, $w$). The
state is:

index[U] = 0, 1, or 2, low[U] = 0, is_term[U] = true
r_{a_5} = w, c_{a_5} = 1
S = [u], n = 3, i = 0, F = [a_4], F_v = []
U = Find(u) = Find(v) = Find(w)



After that, $u$ is popped from $S$, and as index[u] = 0 = i, the loop is terminated.
At Line 59, index[U] is set to $i$, and $U$ is pushed on $S$. Since $F \neq \emptyset$, we go back to
Line 36 in the state:

index[U] = 0, low[U] = 0, is_term[U] = true
r_{a_5} = w, c_{a_5} = 1
S = [U], n = 3, F = [a_4]
U = Find(u) = Find(v) = Find(w)



Then $a_4$ is popped from $F$, and the loop from Lines 38 to 47 iterates over $H(a_4) = \{x, y\}$.
Suppose that $x$ is treated first. Then Visit(x) is called. During its execution, at
Line 36, the state is:

index[U] = 0, low[U] = 0, is_term[U] = true
index[x] = 3, low[x] = 3, is_term[x] = true
r_{a_5} = w, c_{a_5} = 1
S = [x; U], n = 4, F = []
U = Find(u) = Find(v) = Find(w)



Since $F$ is empty, the loop from Lines 36 to 48 is not executed. At Line 49, low[x] =
index[x] and is_term[x] = true, so that a trivial vertex merging step is performed,
only on $x$, since it is the top element of $S$. At Line 59, it can be verified that
S = [x; U], index[x] = 3 and F = []. Therefore, the goto statement at Line 60 is
not executed. It follows that the loop from Lines 62 to 64 is executed, and after
that, the state is:

index[U] = 0, low[U] = 0, is_term[U] = true
index[x] = 3, low[x] = 3, is_term[x] = true
r_{a_5} = w, c_{a_5} = 1
S = [U], n = 4
U = Find(u) = Find(v) = Find(w)
Finished = {x}



After the termination of Visit(x), since $x \in$ Finished, is_term[U] is set to false.
After that, Visit(y) is called, and at Line 35, it can be checked that $c_{a_5}$ has been
incremented to $2 = |T(a_5)|$ because $R_{a_5}$ = Find($r_{a_5}$) = Find(w) = U and $U \in S$.
Therefore, $a_5$ is pushed to $F_U$, and the state is:

index[U] = 0, low[U] = 0, is_term[U] = false
index[x] = 3, low[x] = 3, is_term[x] = true
index[y] = 4, low[y] = 4, is_term[y] = true
r_{a_5} = w, c_{a_5} = 2
S = [y; U], n = 5, F = [], F_U = [a_5]
U = Find(u) = Find(v) = Find(w)
Finished = {x}

As for the vertex $x$, Visit(y) terminates by popping $y$ from $S$ and adding it to
Finished. Back to the execution of Visit(u), at Line 49, the state is:

index[U] = 0, low[U] = 0, is_term[U] = false
index[x] = 3, low[x] = 3, is_term[x] = true
index[y] = 4, low[y] = 4, is_term[y] = true
r_{a_5} = w, c_{a_5} = 2
S = [U], n = 5, F = [], F_U = [a_5]
U = Find(u) = Find(v) = Find(w)
Finished = {y, x}



While low[U] = index[U], is_term[U] is equal to false, so that no vertex merging
loop is performed on $U$. Therefore, $a_5$ is not popped from $F_U$. Nevertheless, the
loop from Lines 62 to 64 is executed, and after that, Visit(u) is terminated in the
state:

index[U] = 0, low[U] = 0, is_term[U] = false
index[x] = 3, low[x] = 3, is_term[x] = true
index[y] = 4, low[y] = 4, is_term[y] = true
r_{a_5} = w, c_{a_5} = 2
S = [], n = 5, F = [], F_U = [a_5]
U = Find(u) = Find(v) = Find(w)
Finished = {U, y, x}



Finally, Visit(t) is called from TerminalScc at Line 13. It can be verified that
a trivial vertex merging loop is performed on $t$ only. After that, $t$ is placed into
Finished. Therefore, the final state of TerminalScc is:

index[U] = 0, low[U] = 0, is_term[U] = false
index[x] = 3, low[x] = 3, is_term[x] = true
index[y] = 4, low[y] = 4, is_term[y] = true
index[t] = 5, low[t] = 5, is_term[t] = true
r_{a_5} = w, c_{a_5} = 2
S = [], n = 6, F_U = [a_5]
U = Find(u) = Find(v) = Find(w)
Finished = {t, U, y, x}

As is_term[x] = is_term[y] = is_term[t] = true and is_term[Find(z)] = false for
z = u, v, w, there are three terminal Sccs, given by the sets:

{z | Find(z) = x} = {x},

{z | Find(z) = y} = {y},

{z | Find(z) = t} = {t}.

Appendix B. Proof of Theorem 4

The correctness proof of the algorithm TerminalScc turns out to be harder
than for algorithms on directed graphs such as Tarjan's one [Tar72], due to the
complexity of the invariants which arise in the former algorithm. That is why we
propose to show the correctness of two intermediary algorithms, named
TerminalScc2 (Figure 6) and TerminalScc3 (Figure 7), and then to prove that they
are equivalent to TerminalScc.

The main difference between the first intermediary form and TerminalScc is
that it does not use auxiliary data associated to the hyperarcs to determine which
ones are added to the digraph $\mathrm{graph}(\mathcal{H}_{cur})$ after a vertex merging step. Instead,
the stack $F$ is directly filled with the right hyperarcs (Lines 22 and 49). Besides,
a boolean no_merge is used to determine whether a vertex merging step has been
executed. The notion of vertex merging step is refined: it now refers to the execution
of the instructions between Lines 41 and 50 in which the boolean no_merge is set to
false.

For the sake of simplicity, we will suppose that sequences of assignments or stack
manipulations are executed atomically. For instance, the sequences of instructions
located in the blocks from Lines 16 to 25, from Lines 41 to 50, and from
Lines 56 to 58 are considered as elementary instructions. Under this assumption,
intermediate complex invariants do not have to be considered.

We first begin with very simple invariants:

Invariant 1. Let $U$ be a vertex of the current hypergraph $\mathcal{H}_{cur}$. Then index[U] is
defined if, and only if, index[u] is defined for all $u \in V$ such that Find(u) = U.

Proof. It can be shown by induction on the number of vertex merging steps which
have been performed on $U$.



 1  function TerminalScc2(V, A)
 2    n := 0, S := [], Finished := ∅
 3    for all a ∈ A do collected_a := false
 4    for all u ∈ V do
 5      index[u] := undef
 6      low[u] := undef
 7      MakeSet(u)
 8    done
 9    for all u ∈ V do
10      if index[u] = undef then
11        Visit2(u)
12      end
13    done
14  end

15  function Visit2(u)
16    local U := Find(u), local F := ∅
17    index[U] := n, low[U] := n
18    n := n + 1
19    is_term[U] := true
20    push U on the stack S
21    local no_merge := true
22    F := { a ∈ A | T(a) = {u} }
23    for all a ∈ F do
24      collected_a := true
25    done
26    while F is not empty do
27      pop a from F
28      for all w ∈ H(a) do
29        local W := Find(w)
30        if index[W] = undef then Visit2(w)
31        if W ∈ Finished then
32          is_term[U] := false
33        else
34          low[U] := min(low[U], low[W])
35          is_term[U] := is_term[U] && is_term[W]
36        end
37      done
38    done
39    if low[U] = index[U] then
40      if is_term[U] = true then
41        local i := index[U]
42        pop V from S
43        while index[V] > i do
44          no_merge := false
45          U := Merge(U, V)
46          pop V from S
47        done
48        push U on S
49        F := { a ∈ A | collected_a = false, ∀x ∈ T(a), Find(x) = U }
50        for all a ∈ F do collected_a := true
51        if no_merge = false then
52          n := i, index[U] := n, n := n + 1
53          no_merge := true, go to Line 26
54        end
55      end
56      repeat
57        pop V from S, add V to Finished
58      until index[V] = index[U]
59    end
60  end

Figure 6. First intermediary form of our almost linear algorithm
on hypergraphs



In the basis case, there is a unique element u ∈ V such that Find(u) = U.
Besides, U = u, so that the statement is trivial.

After a merging step yielding the vertex U, we necessarily have index[U] ≠
undef. Moreover, all the vertices V which have been merged into U satisfied
index[V] ≠ undef because they were stored in the stack S. Applying the induction
hypothesis terminates the proof. □

Invariant 2. Let u ∈ V. When index[u] is defined, then Find(u) belongs either to
the stack S, or to the set Finished (both cases cannot happen simultaneously).

Proof. Initially, Find(u) = u, and once index[u] is defined, Find(u) is pushed on
S (Line 20). Naturally, u ∉ Finished, because otherwise, index[u] would have been
defined before (see the condition Line 58). After that, U = Find(u) can be popped
from S at three possible locations:

• at Lines 42 or 46, in which case U is transformed into a vertex U' which is
immediately pushed on the stack S at Line 48. Since after that, Find(u) = U',
the property Find(u) ∈ S still holds.

• at Line 57, in which case it is directly appended to the set Finished. □

Invariant 3. The set Finished is always growing.

Proof. Once an element is added to Finished, it is never removed from it nor merged
into another vertex (the function Merge is always called on elements immediately
popped from the stack S). □
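
The invariants above are stated in terms of the primitives Makeset, Find and Merge. As a point of reference, here is a minimal Python sketch of a union-find structure offering these three operations. It assumes union by rank with path compression, which is a standard choice compatible with an almost linear bound; this section of the paper does not spell out the implementation, and the class and method names are hypothetical.

```python
class UnionFind:
    """Disjoint sets with union by rank and path compression (a sketch)."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def makeset(self, u):
        self.parent[u] = u
        self.rank[u] = 0

    def find(self, u):
        # First pass: locate the representative of the class of u.
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: make every vertex on the path point to the root.
        while self.parent[u] != root:
            nxt = self.parent[u]
            self.parent[u] = root
            u = nxt
        return root

    def merge(self, u, v):
        """Merge the classes of u and v and return the new representative."""
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return ru
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
        return ru


uf = UnionFind()
for v in (1, 2, 3, 4):
    uf.makeset(v)
rep = uf.merge(1, 2)
rep = uf.merge(rep, 3)
```

As in the pseudocode, merge returns the representative of the merged class, so that a statement such as U := Merge(U, V) can be mirrored directly.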

Proposition 14. After the algorithm TerminalScc2(H) terminates, the sets {v ∈
V | Find(v) = U and is_term[U] = true} are precisely the terminal SCCs of H.

Proof. We prove the whole statement by induction on the number of vertex merging
steps.

Basis Case. First, suppose that the hypergraph H is such that no vertices are
merged during the execution of TerminalScc2(H), i.e. the vertex merging loop
(from Lines 43 to 47) is never executed. Then the boolean no_merge is always set
to true, so that n is never redefined to i + 1 (Line 52), and there is no back edge
to Line 26 in the control-flow graph. It follows that removing Lines 41 to 53
does not change the behavior of the algorithm. Besides, since the function
Merge is never called, Find(u) always coincides with u. Finally, at Line 22, F
is precisely assigned to the set of simple hyperarcs leaving u in H, so that the loop
from Lines 26 to 38 iterates on the successors of u in graph(H). As a consequence,
the algorithm TerminalScc2(H) behaves exactly like TerminalScc(graph(H)).
Moreover, under our assumption, the terminal SCCs of graph(H) are all reduced to
singletons (otherwise, the loop from Lines 43 to 47 would be executed, and some
vertices would be merged). Therefore, by Proposition 1, the statement in
Proposition 14 holds.

Inductive Case. Suppose that the vertex merging loop is executed at least once,
and that its first execution happens during the execution of, say, Visit2(x).
Consider the state of the algorithm at Line 41 just before the execution of the first
occurrence of the vertex merging step. Until that point, Find(v) is still equal to
v for all vertices v ∈ V, so that the execution of TerminalScc2(H) coincides with
the execution of TerminalScc(graph(H)). Consequently, if C is the set formed
by the vertices y located above x in the stack S (including x), C forms a terminal
SCC of graph(H). In particular, the elements of C are located in a same SCC of the
hypergraph H.

Consider the hypergraph H' obtained by merging the elements of C in the
hypergraph (V, A \ {a | ∃y ∈ C s.t. T(a) = {y}}), and let X be the resulting vertex.
For now, we may add a hypergraph as last argument of the functions Visit2, Find,
... to distinguish their execution in the context of the call to TerminalScc2(H)
or TerminalScc2(H'). We make the following observations:

• the vertex x is the first element of the component C to be visited during the
execution of TerminalScc2(H). It follows that the execution of TerminalScc2(H)
until the call to Visit2(x, H) coincides with the execution of TerminalScc2(H')
until the call to Visit2(X, H').

• besides, during the execution of Visit2(x, H), the execution of the loop from
Lines 26 to 38 only has a local impact, i.e. on the is_term[y], index[y], or low[y]
for y ∈ C, and not on any information relative to other vertices. Indeed, we claim
that the set of the vertices y on which Visit2 is called during the execution of the
loop is exactly C \ {x}. First, for all y ∈ C \ {x}, Visit2(y) has necessarily been
executed after Line 26 (otherwise, by Invariant 2, y would be either below x in
the stack S, or in Finished). Conversely, suppose that after Line 26, there is a call
to Visit2(t) with t ∉ C. By Invariant 2, t belongs to Finished, so that for one of
the vertices w examined in the loop, either w ∈ Finished or is_term[w] = false
after the call to Visit2(w). Hence is_term[x] should be false, which contradicts
our assumptions.

• finally, from the execution of Line 53 during the call to Visit2(x, H), our
algorithm behaves exactly as TerminalScc2(H') from the execution of Line 26 in
Visit2(X, H'). Indeed, index[X] is equal to i, and the latter is equal to n − 1.
Similarly, for all y ∈ C, low[y] = i and is_term[y] = true. The vertex X being
equal to one of the y ∈ C, we also have low[X] = i and is_term[X] = true.
Moreover, X is the top element of S.

Furthermore, it can be verified that at Line 49, the set F contains exactly all
the hyperarcs of A which generate the simple hyperarcs leaving X in H': they
are exactly characterized by

    Find(z, H) = X for all z ∈ T(a), and T(a) ≠ {y} for all y ∈ C
    ⟺ Find(z, H) = X for all z ∈ T(a), and collected_a = false

since at Line 49, a hyperarc a satisfies collected_a = true if, and only if, T(a)
is reduced to a singleton {t} such that index[t] is defined.

Finally, for all vertices y ∈ C, Find(y, H) can be equivalently replaced by
Find(X, H').

As a consequence, TerminalScc2(H) and TerminalScc2(H') return the same
result. Both functions perform the same union-find operations, except the first
vertex merging step executed by TerminalScc2(H) on C.

Let f be the function which maps all vertices y ∈ C to X, and any other vertex
to itself. We claim that H' and f(H) have the same reachability graph, i.e. ⇝_{H'}
and ⇝_{f(H)} are identical relations. Indeed, the two hypergraphs only differ on the
images of the hyperarcs a ∈ A such that T(a) = {y} for some y ∈ C. For such
hyperarcs, we have H(a) ⊆ C, because otherwise, is_term[x] would have been set
to false (i.e. the SCC C would not be terminal). It follows that they are mapped to
the cycle ({X}, {X}) by f, so that H' and f(H) clearly have the same reachability
graph. In particular, they have the same terminal SCCs.
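
The effect of the map f on hyperarcs can be illustrated with a short Python sketch. It assumes the same (tail, head) encoding by frozensets as a convenience; the function name apply_merge and the label "X" for the merged vertex are hypothetical.

```python
def apply_merge(arcs, component, merged):
    """Image of the hyperarcs when every vertex of `component` maps to `merged`."""
    def f(v):
        return merged if v in component else v

    return {(frozenset(f(v) for v in tail), frozenset(f(v) for v in head))
            for tail, head in arcs}


arcs = [(frozenset({1}), frozenset({2})),
        (frozenset({2}), frozenset({1})),
        (frozenset({1, 2}), frozenset({3}))]

# Collapse the component C = {1, 2} into a single vertex "X".
image = apply_merge(arcs, component={1, 2}, merged="X")
```

In this toy instance, the two hyperarcs with a singleton tail in C and a head inside C both collapse to the cycle ({X}, {X}), while the hyperarc ({1, 2}, {3}) becomes the simple hyperarc ({X}, {3}).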

Finally, since the elements of C are in a same SCC of H, Proposition 3 shows
that the function f induces a one-to-one correspondence between the SCCs of H
and the SCCs of f(H):

    D ⟼ f(D)
    (D' \ {X}) ∪ C ⟻ D'    if X ∈ D'
    D' ⟻ D'                otherwise.

The action of the function f exactly corresponds to the vertex merging step
performed on C. Since by induction hypothesis, TerminalScc2(H') determines the
terminal SCCs in f(H), it follows that Proposition 14 holds. □
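
On small instances, the statement of Proposition 14 can be checked against a direct computation of the terminal SCCs. The Python sketch below is such a reference computation, not the algorithm of the paper: it runs in polynomial but far from linear time. It assumes the usual reachability relation of directed hypergraphs, in which v is reachable from u when v = u or when some hyperarc a has v in H(a) and its whole tail T(a) reachable from u; the function names are hypothetical.

```python
def reachable(u, arcs):
    """Vertices reachable from u, by saturation over the hyperarcs."""
    reached = {u}
    changed = True
    while changed:
        changed = False
        for tail, head in arcs:
            # A hyperarc fires once its whole tail has been reached.
            if tail <= reached and not head <= reached:
                reached |= head
                changed = True
    return reached


def terminal_sccs_bruteforce(vertices, arcs):
    """Terminal SCCs, i.e. SCCs that reach no vertex outside themselves."""
    reach = {u: reachable(u, arcs) for u in vertices}
    result = set()
    for u in vertices:
        scc = frozenset(v for v in reach[u] if u in reach[v])
        if reach[u] <= scc:
            result.add(scc)
    return result


arcs = [(frozenset({1}), frozenset({2})), (frozenset({2}), frozenset({1})),
        (frozenset({1, 2}), frozenset({3})),
        (frozenset({3}), frozenset({4})), (frozenset({4}), frozenset({3}))]
expected = terminal_sccs_bruteforce([1, 2, 3, 4], arcs)
```

Here the SCC {1, 2} reaches {3, 4} through the hyperarc ({1, 2}, {3}), so {3, 4} is the only terminal SCC.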



The second intermediary version of our algorithm, TerminalScc3, is based on
the first one, but it performs the same computations on the auxiliary data r_a and
c_a as in TerminalScc. However, the latter are never used, because at Line 62,
F is re-assigned to the value provided in TerminalScc2. It follows that for now,
the parts in gray can be ignored. The following proposition states that TerminalScc2
and TerminalScc3 are equivalent:

 1: function TerminalScc3(V, A)
 2:   n := 0, S := [], Finished := ∅
 3:   for all a ∈ A do
 4:     r_a := undef, c_a := 0
 5:     collected_a := false
 6:   done
 7:   for all u ∈ V do
 8:     index[u] := undef, low[u] := undef
 9:     Makeset(u), F_u := []
10:   done
11:   for all u ∈ V do
12:     if index[u] = undef then
13:       Visit3(u)
14:     end
15:   done
16: end

17: function Visit3(u)
18:   local U := Find(u), local F := []
19:   index[U] := n, low[U] := n, n := n + 1
20:   is_term[U] := true
21:   push U on the stack S
22:   for all a ∈ A_u do
23:     if |T(a)| = 1 then push a on F
24:     else
25:       if r_a = undef then r_a := u
26:       local R_a := Find(r_a)
27:       if R_a appears in S then
28:         c_a := c_a + 1
29:         if c_a = |T(a)| then
30:           push a on the stack F_{R_a}
31:         end
32:       end
33:     end
34:   done
35:   for all a ∈ F do
36:     collected_a := true
37:   done

Figure 7. Second intermediary form of our linear algorithm on hypergraphs 

Proposition 15. Let H be a directed hypergraph. After the execution of the
algorithm TerminalScc3(H), the sets {v ∈ V | Find(v) = U and is_term[U] = true}
precisely correspond to the terminal SCCs of H.

Proof. When Visit3(u) is executed, the local stack F is not directly assigned to the
set {a ∈ A | T(a) = {u}} (see Line 22 in Figure 6), but built by several iterations
on the set A_u (Line 23). Since u ∈ T(a) and |T(a)| = 1 holds if, and only if, T(a) is
reduced to {u}, Visit3(u) initially fills F with the same hyperarcs as Visit2(u).

Besides, the condition no_merge = false in Visit2 (Line 51) is replaced by F ≠ ∅
(Line 64). We claim that the condition F ≠ ∅ can be safely used in Visit2 as
well. Indeed, in Visit2, F ≠ ∅ implies no_merge = false. Conversely, suppose
that in Visit2, no_merge = false and F = ∅, so that the algorithm jumps back
to Line 26 (from Line 53) after having set no_merge to true. The loop from
Lines 26 to 38 is not executed since F = ∅, and it directly leads to a new execution
of Lines 39 to 51 with no_merge = true. Therefore, the jump at Line 53 was useless.

Finally, during the vertex merging step in Visit3, n keeps its value, which is
greater than or equal to i + 1, but is not necessarily equal to i + 1 like in Visit2
(just after Line 52). This is safe because the whole algorithm only needs n to take
increasing values, and not necessarily consecutive ones.

We conclude by applying Proposition 14. □



38:   while F is not empty do
39:     pop a from F
40:     for all w ∈ H(a) do
41:       local W := Find(w)
42:       if low[W] = undef then Visit3(w)
43:       if W ∈ Finished then
44:         is_term[U] := false
45:       else
46:         low[U] := min(low[U], low[W])
47:         is_term[U] := is_term[U] && is_term[W]
48:       end
49:     done
50:   done
51:   if low[U] = index[U] then
52:     if is_term[U] = true then
53:       local i := index[U]
54:       pop each a ∈ F_U and push it on F
55:       pop V from S
56:       while index[V] > i do
57:         pop each a ∈ F_V and push it on F
58:         U := Merge(U, V)
59:         pop V from S
60:       done
61:       index[U] := i, push U on S
62:       F := {a ∈ A | collected_a = false, ∀x ∈ T(a), Find(x) = U}
63:       for all a ∈ F do collected_a := true
64:       if F ≠ ∅ then go to Line 38
65:     end
66:     repeat
67:       pop V from S, add V to Finished
68:     until index[V] = index[U]
69:   end
70: end

We make similar assumptions on the atomicity of the sequences of instructions.
Note that Invariants 1, 2 and 3 still hold in Visit3.

Invariant 4. Let a ∈ A such that |T(a)| > 1. If for all x ∈ T(a), index[x] is
defined, then the root r_a is defined.

Proof. For all x ∈ T(a), Visit3(x) has been called. The root r_a has necessarily
been defined at the first of these calls (remember that the block from Lines 18 to 37
is supposed to be executed atomically). □

Invariant 5. Consider a state cur of the algorithm in which U ∈ Finished. Then
any vertex reachable from U in graph(H_cur) is also in Finished.

Proof. The invariant clearly holds when U is placed in Finished. Using the
atomicity assumptions, the call to Visit3(u) is necessarily terminated. Let old be the
state of the algorithm at that point, and H_old and Finished_old the corresponding
hypergraph and set of terminated vertices at that state respectively. Since Visit3(u)
has performed a depth-first search from the vertex U in graph(H_old), all the vertices
reachable from U in H_old stand in Finished_old.

We claim that the invariant is then preserved by the following vertex merging
steps. The graph arcs which may be added by the latter leave vertices in S, and
consequently not elements of Finished (by Invariant 2). It follows that the set
of reachable vertices from elements of Finished_old is not changed by future vertex
merging steps. As a result, all the vertices reachable from U in graph(H_cur) are
elements of Finished_old. Since by Invariant 3, Finished_old ⊆ Finished, this proves
the whole invariant in the state cur. □

Invariant 6. In the digraph graph(H_cur), at the call to Visit3(u), u is reachable
from a vertex W such that index[W] is defined if, and only if, W belongs to the
stack S.

Proof. The "if" part can be shown by induction. When the function Visit3(u) is
called from Line 13, the stack S is empty, so that this is obvious. Otherwise, it
is called from Line 42 during the execution of Visit3(x). Then X = Find(x) is
reachable from any vertex in the stack, since x was itself reachable from any vertex
in the stack at the call to Visit3(x) (inductive hypothesis) and this reachability
property is preserved by potential vertex merging steps (Proposition 3). As u is
obviously reachable from X, this shows the statement.

Conversely, suppose that index[W] is defined, and W is not in the stack.
According to Invariant 2, W is necessarily an element of Finished. Hence u also belongs
to Finished by Invariant 5, which is a contradiction since this cannot hold at the
call to Visit3(u). □

Invariant 7. Let a ∈ A such that |T(a)| > 1. Consider a state cur of the algorithm
TerminalScc3 in which r_a is defined.

Then c_a is equal to the number of elements x ∈ T(a) such that index[x] is defined
and Find(x) is reachable from Find(r_a) in graph(H_cur).

Proof. Since at Line 28, c_a is incremented only if R_a = Find(r_a) belongs to S, we
already know using Invariant 6 that c_a is equal to the number of elements x ∈ T(a)
such that, at the call to Visit3(x), x was reachable from Find(r_a).

Now, let x ∈ V, and consider a state cur of the algorithm in which r_a and
index[x] are both defined, and Find(r_a) appears in the stack S. Since index[x]
is defined, Visit3 has been called on x, and let old be the state of the algorithm
at that point. Let us denote by H_old and H_cur the current hypergraphs at the
states old and cur respectively. Like previously, we may add a hypergraph as
last argument of the function Find to distinguish its execution in the states old
and cur. We claim that Find(r_a, H_cur) ⇝_{graph(H_cur)} Find(x, H_cur) if, and only if,
Find(r_a, H_old) ⇝_{graph(H_old)} x. The "if" part is due to the fact that reachability in
graph(H_old) is not altered by the vertex merging steps (Proposition 3). Conversely,
if x is not reachable from Find(r_a, H_old) in H_old, then Find(r_a, H_old) is not in
the call stack S_old (Invariant 6), so that it is an element of Finished_old. But
Finished_old ⊆ Finished_cur, which contradicts our assumption since by Invariant 2
an element cannot be stored in Finished_cur and S_cur at the same time. It follows
that if r_a is defined and Find(r_a) appears in the stack S, c_a is equal to the number of
elements x ∈ T(a) such that index[x] is defined and Find(r_a) ⇝_{graph(H_cur)} Find(x).

Let cur be the state of the algorithm when Find(r_a) is moved from S to
Finished. The invariant still holds. Besides, in the future states new, c_a is not
incremented because Find(r_a, H_cur) ∈ Finished_cur ⊆ Finished_new (Invariant 3), so that
Find(r_a, H_new) = Find(r_a, H_cur), and the latter cannot appear in the stack S_new
(Invariant 2). Furthermore, any vertex reachable from R_a = Find(r_a, H_new) in
graph(H_new) belongs to Finished_new (Invariant 5). It even belongs to Finished_cur,
as shown in the second part of the proof of Invariant 5 (emphasized sentence). It
follows that the number of reachable vertices from Find(r_a) has not changed
between states cur and new. Therefore, the invariant on c_a will be preserved, which
completes the proof. □

Proposition 16. In Visit3, the assignment at Line 62 does not change the value
of F.

Proof. It can be shown by strong induction on the number p of times that this line
has been executed. Suppose that we are currently at Line 53, and let X_1, ..., X_q
be the elements of the stack located above the root U = X_1 of the terminal SCC of
graph(H_cur). Any arc a which will be transferred to F from Line 53 to Line 60 satisfies
c_a = |T(a)| > 1 and Find(r_a) = X_i for some 1 ≤ i ≤ q (since at Line 53, F is initially
empty). Invariant 7 implies that for all elements x ∈ T(a), Find(x) is reachable
from X_i in graph(H_cur), so that by terminality of the SCC C = {X_1, ..., X_q},
Find(x) belongs to C, i.e. there exists j such that Find(x) = X_j. It follows that
at Line 62, Find(x) = U for all x ∈ T(a). Then, we claim that collected_a = false
at Line 62. Indeed, a' ∈ A satisfies collected_{a'} = true if, and only if:

• either it has been copied to F at Line 23, in which case |T(a')| = 1,

• or it has been copied to F at the r-th execution of Line 62, with r ≤ p. By
induction hypothesis, this means that a' has been pushed on a stack F_X and
then popped from it strictly before the r-th execution of Line 62.

Observe that a given hyperarc can be popped from a stack F_X at most once during
the whole execution of TerminalScc3. Here, a has been popped from F_{X_i} after
the p-th execution of Line 62, and |T(a)| > 1. It follows that collected_a = false.

Conversely, suppose that, at Line 62, collected_a = false, and all the x ∈ T(a)
satisfy Find(x) = U. Clearly, |T(a)| > 1 (otherwise, a would have been placed
into F at Line 23 and collected_a would be equal to true). A few steps before, at
Line 53, Find(x) is equal to one of the X_j, 1 ≤ j ≤ q. Since index[X_j] is defined
(X_j is an element of the stack S), by Invariant 1, index[x] is also defined for all
x ∈ T(a), hence the root r_a is defined by Invariant 4. Besides, Find(r_a) is equal
to one of the X_j, say X_k (since r_a ∈ T(a)). As all the Find(x) are reachable
from Find(r_a) in graph(H_cur), then c_a = |T(a)| using Invariant 7. It follows that
a has been pushed on the stack F_{R_a}, where R_a = Find(r_a, H_old) in a previous
state old of the algorithm. As collected_a = false, a has not been popped from F_{R_a},
and consequently, the vertex R_a of H_old has not been involved in a vertex merging step.
Therefore, R_a is still equal to Find(r_a, H_cur) = X_k. It follows that at Line 53, a
is stored in F_{X_k}, and thus it is copied to F between Lines 53 and 60. This completes
the proof. □

We can now prove the correctness of TerminalScc.

Proof of the theorem. By Proposition 16, Line 62 can be safely removed in Visit3.
It follows that the booleans collected_a are now useless, so that Line 5, the loop
from Lines 35 to 37, and Line 63 can also be removed. After that, we precisely
obtain the algorithm TerminalScc. Proposition 15 completes the proof. □
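
For readers who want to experiment, the algorithm obtained at the end of this proof can be sketched in Python as follows. This is an illustrative transcription under several assumptions, not the authors' implementation: hyperarcs are (tail, head) pairs of non-empty frozensets and vertices is a list; the "go to" is rendered by a loop; membership of R_a in the stack S is tested through Finished, which Invariant 2 justifies; index, low and is_term are re-set on the representative returned by the merge; and terminal components are recorded explicitly when they are finalized instead of being read off is_term afterwards. All Python names are hypothetical.

```python
def terminal_sccs(vertices, arcs):
    """Terminal SCCs of a hypergraph; arcs are (tail, head) frozenset pairs."""
    parent = {u: u for u in vertices}
    size = {u: 1 for u in vertices}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]      # path halving
            u = parent[u]
        return u

    leaving = {u: [] for u in vertices}        # A_u: hyperarcs with u in the tail
    for k, (tail, _head) in enumerate(arcs):
        for u in tail:
            leaving[u].append(k)
    index, low, is_term = {}, {}, {}
    stack, finished = [], set()
    pending = {u: [] for u in vertices}        # F_U
    root = [None] * len(arcs)                  # r_a
    count = [0] * len(arcs)                    # c_a
    terminal_reps = []                         # recorded explicitly (assumption)
    n = 0

    def visit(u):
        nonlocal n
        U = find(u)
        index[U] = low[U] = n
        n += 1
        is_term[U] = True
        stack.append(U)
        F = []
        for k in leaving[u]:
            tail = arcs[k][0]
            if len(tail) == 1:
                F.append(k)
                continue
            if root[k] is None:
                root[k] = u
            R = find(root[k])
            if R not in finished:              # by Invariant 2, R is then in S
                count[k] += 1
                if count[k] == len(tail):
                    pending[R].append(k)
        while True:
            while F:
                k = F.pop()
                for w in arcs[k][1]:
                    if find(w) not in index:
                        visit(w)
                    W = find(w)
                    if W in finished:
                        is_term[U] = False
                    else:
                        low[U] = min(low[U], low[W])
                        is_term[U] = is_term[U] and is_term[W]
            if low[U] != index[U]:
                return                         # U is not the root of an SCC
            if is_term[U]:                     # vertex merging step
                i = index[U]
                F.extend(pending[U])
                pending[U].clear()
                V = stack.pop()
                while index[V] > i:
                    F.extend(pending[V])
                    pending[V].clear()
                    if size[U] < size[V]:      # union by size
                        U, V = V, U
                    parent[V] = U
                    size[U] += size[V]
                    V = stack.pop()
                index[U] = low[U] = i          # re-set on the new representative
                is_term[U] = True
                stack.append(U)
                if F:
                    continue                   # plays the role of the "go to"
                terminal_reps.append(U)
            while True:                        # move the whole SCC to Finished
                V = stack.pop()
                finished.add(V)
                if index[V] == index[U]:
                    break
            return

    for u in vertices:
        if u not in index:
            visit(u)
    classes = {}
    for v in vertices:
        classes.setdefault(find(v), set()).add(v)
    return {frozenset(classes[r]) for r in terminal_reps}


arcs = [(frozenset({1}), frozenset({2})), (frozenset({2}), frozenset({1})),
        (frozenset({1, 2}), frozenset({3})),
        (frozenset({3}), frozenset({4})), (frozenset({4}), frozenset({3}))]
result = terminal_sccs([1, 2, 3, 4], arcs)
```

In this example, the hyperarc ({1, 2}, {3}) is released from the stack F_U only once 1 and 2 have been merged, and the call then discovers the terminal component {3, 4}. The sketch is recursive, so very deep instances would require an explicit stack or a larger recursion limit.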